Human chromosome 16 genes, compositions, methods of making and using same

ABSTRACT

In accordance with the present invention, there are provided isolated nucleic acids encoding a human netrin, a human ATP binding cassette transporter, a human ribosomal L3 subtype, and a human augmenter of liver regeneration as well as isolated protein products encoded thereby. The present invention provides nucleic acid probes that hybridize to invention nucleic acids as well as isolated nucleic acids comprising unique gene sequences located on chromosome 16. Further provided are vectors containing invention nucleic acids, host cells transformed therewith, as well as transgenic non-human mammals that express invention polypeptides. The present invention includes antisense oligonucleotides, antibodies and compositions containing same. Additionally, the invention provides methods for identifying compounds that bind to invention polypeptides.

This invention was made in part with Government support under Grant No. DK44853, from the National Institutes of Health. The Government may have certain rights in this invention.

This application is a continuation-in-part of U.S. application Ser. No. 08/665,259, filed Jun. 17, 1996, currently pending, which is a continuation-in-part of U.S. application Ser. No. 60/000,596, filed Jun. 30, 1995.

BACKGROUND OF THE INVENTION

The assembly of contiguous cloned genomic reagents is a necessary step in the process of disease-gene identification using a positional cloning approach. The rapid development of high density genetic maps based on polymorphic simple sequence repeats has facilitated contig assembly using sequence tagged site (STS) content mapping. Most contig construction efforts have relied on yeast artificial chromosomes (YACs), since their large insert size uses the current STS map density more advantageously than bacterial-hosted systems. This approach has been validated for multiple human chromosomes with YAC coverage ranging from 65-95% for many chromosomes and contigs of 11 to 36 Mb being described (Chumakov et al., Nature 377 (Supp.):175-297, 1995; Doggett et al., Nature 377 (Supp.):335-365, 1995b; Gemmill et al., Nature 377 (Supp.):299-319, 1995; Krauter et al., Nature 377 (Supp.):321-333, 1995; Shimizu et al., Cytogenet. Cell Genet. 70:147-182, 1995; van-Heyningen et al., Cytogenet. Cell Genet. 69:127-158, 1995).

Despite numerous successes, the YAC cloning system is not a panacea for cloning the entire genome of complex organisms due to intrinsic limitations that result in substantial proportions of chimeric clones (Green et al., Genomics 11:658-669, 1991; Bellanne-Chantelot et al., Cell 70:1059-1068, 1992; Nagaraja et al., Nuc. Acids Res. 22:3406-3411, 1994), as well as clones that are rearranged, deleted or unstable (Neil et al., Nuc. Acids Res. 18:1421-1428, 1990; Wada et al., Am. J. Hum. Genet. 46:95-106, 1990; Zuo et al., Hum. Mol. Genet. 1:149-159, 1992; Szepetowski et al., Cytogenet. Cell Genet. 69:101-107, 1995). At least some of these cloning artifacts are a product of the recombinational machinery of yeast acting on the various types of repetitive elements in mammalian DNA (Neil et al., supra. 1990; Green et al., supra. 1991; Schlessinger et al., Genomics 11:783-793, 1991; Ling et al., Nuc. Acids Res. 21:6045-6046, 1993; Kouprina et al., Genomics 21:7-17, 1994; Larionov et al., Nuc. Acids Res. 22:4154-4162, 1994).

Accordingly, alternative cloning systems must be used in concert with YAC-based approaches to complement localized YAC cloning deficiencies, to enhance the resolution of the physical map, and to provide a sequence-ready resource for genome-wide DNA sequencing. Several exon trapping methodologies and vectors have been described for the rapid and efficient isolation of coding regions from genomic DNA (Auch et al., Nuc. Acids Res. 18:6743-6744, 1990; Duyk et al., Proc. Natl. Acad. Sci., USA 87:8995-8999, 1990; Buckler et al., Proc. Natl. Acad. Sci., USA 88:4005-4009, 1991; Church et al., Nature Genet. 6:98-105, 1994). The major advantage of exon trapping is that the expression of cloned genomic DNAs (cosmid, P1 or YAC) is driven by a heterologous promoter in tissue culture cells. This allows for coding sequences to be identified without prior knowledge of their tissue distribution or developmental stage of expression. A second advantage is that exon trapping allows for the identification of coding sequences from only the cloned template of interest, which eliminates the risk of characterizing highly conserved transcripts from duplicated loci. This is not the case for either cDNA selection or direct library screening.

Exon trapping has been used successfully to identify transcribed sequences in the Huntington's disease locus (Ambrose et al., Hum. Mol. Genet. 1:697-703, 1992; Taylor et al., Nature Genet. 2:223-227, 1992; Duyao et al., Hum. Mol. Genet. 2:673-676, 1993) and BRCA1 locus (Brody et al., Genomics 25:238-247, 1995; Brown et al., Proc. Natl. Acad. Sci., USA 92:4362-4366, 1995). In addition, a number of disease-causing genes have been identified using exon trapping, including the genes for Huntington's disease (The Huntington's Disease Collaborative Research Group, Cell 72:971-983, 1993), neurofibromatosis type 2 (Trofatter et al., Cell 72:791-800, 1993), Menkes disease (Vulpe et al., Nature Genet. 3:7-13, 1993), Batten Disease (The International Batten Disease Consortium, Cell 82:949-957, 1995), and the gene responsible for the majority of Long-QT syndrome cases (Wang et al., Nature Genet. 12:17-23, 1996).

A 700 kb CpG-rich region in band 16p13.3 has been shown to contain the disease gene for ~90% of the cases of autosomal dominant polycystic kidney disease (PKD1) (Germino et al., Genomics 13:144-151, 1992; Somlo et al., Genomics 13:152-158, 1992; The European Polycystic Kidney Disease Consortium, Cell 77:881-894, 1994) as well as the tuberin gene (TSC2), responsible for one form of tuberous sclerosis (The European Chromosome 16 Tuberous Sclerosis Consortium, Cell 75:1305-1315, 1993). An estimated 20 genes are present in this region of chromosome 16 (Germino et al., Kidney Int. Supp. 39:S20-S25, 1993). Characterization of the region surrounding the PKD1 gene in 16p13.3, however, has been complicated by duplication of a portion of the genomic interval more proximally at 16p13.1 (The European Polycystic Kidney Disease Consortium, supra. 1994).

This chromosomal segment serves as a challenging test for large-insert cloning systems in E. coli and yeast since it resides in a GC-rich isochore (Saccone et al., Proc. Natl. Acad. Sci., USA 89:4913-4917, 1992) with an abundance of CpG islands (Harris et al., Genomics 7:195-206, 1990; Germino et al., supra. 1992), genes (Germino et al., supra. 1993) and Alu repetitive sequences (Korenberg et al., Cell 53:391-400, 1988). Chromosome 16 also contains more low-copy repeats than other chromosomes with almost 25% of its cosmid contigs hybridizing to more than one chromosomal location when analyzed by fluorescence in situ hybridization (FISH) (Okumura et al., Cytogenet. Cell Genet. 67:61-67, 1994). These types of repeats and sequence duplications interfere with "chromosome walking" techniques that are widely used for identification of genomic DNA and pose a challenge to hybridization-based methods of contig construction. This is because these techniques rely on hybridization to identify clones containing overlapping fragments of genomic DNA; thus, there is a high likelihood of "walking" into clones derived from homologues instead of clones derived from the authentic gene. In a similar manner, the sequence duplications and chromosome 16-specific repeats also interfere with the unambiguous determination of a complete cDNA sequence that encodes the corresponding protein. Furthermore, low copy repeats may lead to instability of this interval in bacteria, yeast and higher eukaryotes.

Thus, there is a need in the art for methods and compositions which enable accurate identification of genomic and cDNA sequences corresponding to authentic genes present on highly repetitive portions of chromosome 16, as well as genes similarly situated on other chromosomes. The present invention satisfies this need and provides related advantages as well.

SUMMARY OF THE INVENTION

In accordance with the present invention, there are provided isolated nucleic acids encoding a human netrin, a human ATP binding cassette transporter, a human ribosomal L3 subtype, and a human augmenter of liver regeneration.

The present invention further provides isolated protein products encoded by a human netrin gene, a human ATP binding cassette transporter gene, a human ribosomal L3 gene, and a human augmenter of liver regeneration gene.

Additionally, the present invention provides nucleic acid probes that hybridize to invention nucleic acids as well as isolated nucleic acids comprising unique gene sequences located on chromosome 16.

Further provided are vectors containing invention nucleic acids as well as host cells transformed with invention vectors.

Transgenic non-human mammals that express invention polypeptides are provided by the present invention.

The present invention includes antisense oligonucleotides, antibodies and compositions containing same.

Additionally, the invention provides methods for identifying compounds that bind to invention polypeptides. Such compounds are useful for modulating the activity of invention polypeptides.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a schematic diagram of the P1 contig and trapped exons.

FIGS. 2A and 2B show an alignment of selected exon traps with sequences in the databases.

FIGS. 3A, 3B, and 3C show 6803 bp of hNET genomic sequence from P1 clone 53.8B (SEQ ID NO:19).

FIGS. 4A and 4B show 1743 bp of hNET cDNA and deduced amino acid sequence coding for a human homologue of chicken netrin genes (SEQ ID NOs:20 and 21).

FIGS. 4C and 4D show the nucleotide sequence of the 1.9 kb hNET cDNA including both 5' and 3' UTRs (SEQ ID NO:78).

FIG. 5 shows an amino acid comparison between chicken netrin-1 (SEQ ID NO:22), chicken netrin-2 (SEQ ID NO:23) and hNET (SEQ ID NO:21). Shaded boxes denote regions of sequence identity. The laminin domains V and VI and the C-terminal domain (C) are indicated by arrows with domain V divided into three sub-components (V-1 to V-3). The asterisks identify a motif for adhesion/signaling receptors.

FIG. 6 shows a graphical representation of the homology between domains of chicken netrin-1, chicken netrin-2 and hNET.

FIG. 7 shows exon traps, RT-PCR products and cDNA from the ABCgt.1 clone. Exon traps are shown above. ABCgt.1 DNA is shown below the exon traps with the position of the Genetrapper selection (S) and repair (R) oligonucleotides indicated. The positions of the RT-PCR clones are shown below the cDNA.

FIGS. 8A-8H show 5.8 kb of cDNA and deduced amino acid sequence of the ABCgt.1 clone (SEQ ID NOs:24 and 25).

FIGS. 9A-9D show an amino acid alignment of murine ABC1 (SEQ ID NO:26) and ABC2 (SEQ ID NO:27) with clone ABCgt.1 (SEQ ID NO:25). Hyphens denote gaps; asterisks denote identical residues, while periods denote conservative substitutions. The location of the ATP binding cassettes is shown by the boxed regions. Numbers at the right show the relative position of the proteins.

FIG. 10 shows the region of the transcriptional map of the PKD1 locus from which P1 clones 49.10D, 109.8C and 47.2H were isolated. The open boxes represent trapped exons with their relative position indicated below the RPL3L (SEM L3) gene. c, r and h identify the location of the capture, repair and hybridization oligonucleotides, respectively.

FIGS. 11A-11B show the nucleotide and deduced amino acid sequence of the SEM L3 cDNA, now designated RPL3L (SEQ ID NOs:28 and 29). The 5' upstream inframe stop codon is underlined and the arrows indicate the site of the polyA tract of the two shorter cDNA clones that were also isolated.

FIG. 12 shows a comparison of the deduced amino acid sequences from human (SEQ ID NO:30), bovine (SEQ ID NO:31), murine (SEQ ID NO:32) and the RPL3L (SEM L3) (SEQ ID NO:29) genes. Dashes indicate sequence identity to the human L3 gene. The nuclear targeting sequence at the N-terminal end is shaded and the bipartite motif is boxed.

FIG. 13 shows the nucleotide and deduced amino acid sequence of the hALR cDNA (SEQ ID NOs:33 and 34).

FIG. 14 shows a comparison of the deduced amino acid sequences from rat ALR and human ALR (SEQ ID NOs:35 and 34, respectively).

FIGS. 15A-15J show the nucleotide and deduced amino acid sequence of full-length hABC3 cDNA (SEQ ID NOs:74 and 75).

FIG. 16 shows a physical map of the region containing the hABC3 gene.

FIG. 17A shows the deduced amino acid sequence for hABC3 (SEQ ID NO:75) aligned to the murine ABC1 (SEQ ID NO:26) and ABC2 (SEQ ID NO:27) sequences (Luciani et al., Genomics 21:150-159, 1994) and sequence predicted to be encoded by C. elegans cosmid C.48B4.4 (SEQ ID NO:77) (Wilson et al., Nature 368:32-38, 1994). Sequence identity is shown by letters, with mismatches denoted as periods. Gaps inserted during the alignment are also shown (=). For ABC1, ABC2 and C.48B4.4, only those sequences included in, and C-terminal to, the first ATP-binding domain are shown. Boxes denote the ATP binding cassettes (I and III) and the HH1 domain (II).

FIG. 17B shows a schematic diagram of the ABC3 protein showing the transmembrane (TM) domains, ATP binding cassette (ABC) domains, Linker and HH1 domains.

FIG. 18 shows a map of the genomic interval surrounding the human netrin gene.

FIG. 19A shows a GRAIL2 analysis of coding sequences in the 6.8 kb genomic sequence from 53.8B P1.

FIG. 19B shows the results of a Pustell DNA/protein matrix comparing genomic sequence to chicken netrin-2.

FIG. 20A shows alignment of the human netrin with chicken netrin-1, chicken netrin-2 and UNC-6 (SEQ ID NO:79).

FIG. 20B shows a schematic of the genomic sequence with boxes representing exons and lines denoting the introns. The untranslated region is shown in black, with the location of the start codon indicated by the arrow. The domain structure of the human netrin protein is shown below the gene structure. The position of introns in the Drosophila netrin genes is shown by arrows, with the non-conserved intron being denoted by the open arrow.

DETAILED DESCRIPTION OF THE INVENTION

All patent applications, patents, and literature references cited in this specification are hereby incorporated by reference in their entirety. In case of conflict or inconsistency, the present description, including definitions, will control.

Definitions

1. "Complementary DNA (cDNA)" is defined herein as a single-stranded or double-stranded intronless DNA molecule that is derived from the authentic gene and whose sequence, or complement thereof, encodes a protein.

2. As referred to herein, a "contig" is a continuous stretch of DNA or DNA sequence, which may be represented by multiple, overlapping clones or sequences.

3. As referred to herein, a "cosmid" is a DNA plasmid that can replicate in bacterial cells and that accommodates large DNA inserts from about 30 to about 51 kb in length.

4. The term "P1 clones" refers to genomic DNAs cloned into vectors based on the P1 phage replication mechanisms. These vectors generally accommodate inserts of about 70 to about 105 kb (Pierce et al., Proc. Natl. Acad. Sci., USA, 89:2056-2060, 1992).

5. As used herein, the term "exon trapping" refers to a method for isolating genomic DNA sequences that are flanked by donor and acceptor splice sites for RNA processing.

6. "Amplification" of DNA as used herein denotes a reaction that serves to increase the concentration of a particular DNA sequence within a mixture of DNA sequences. Amplification may be carried out using polymerase chain reaction (PCR) (Saiki et al., Science, 239:487, 1988), ligase chain reaction (LCR), nucleic acid-specific based amplification (NSBA), or any method known in the art.

7. "RT-PCR" as used herein refers to coupled reverse transcription and polymerase chain reaction. This method of amplification uses an initial step in which a specific oligonucleotide, oligo dT, or a mixture of random primers is used to prime reverse transcription of RNA into single-stranded cDNA; this cDNA is then amplified using standard amplification techniques, e.g., PCR.

A P1 contig containing approximately 700 kb of DNA surrounding the PKD1 and TSC2 genes was assembled from a set of 12 unique chromosome 16-derived P1 clones obtained by screening a 3 genome equivalent P1 library (Shepherd et al., Proc. Natl. Acad. Sci., USA 91:2629-2633, 1994) with 15 distinct probes. Exon trapping was used to identify transcribed sequences from this region in 16p13.3.
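As a rough illustration of what screening a 3 genome equivalent library implies, the chance that any given locus is present in at least one clone can be estimated with a Poisson approximation. The sketch below assumes that clones are sampled uniformly at random across the genome, which is an idealization not stated in this specification; the function name is hypothetical.

```python
import math


def coverage_probability(genome_equivalents: float) -> float:
    """Estimate P(a given locus is in at least one clone).

    Assumes clones are distributed at random over the genome
    (Poisson model), so P = 1 - exp(-c), where c is the library
    depth in genome equivalents. This is an illustrative
    assumption, not a property measured for the P1 library.
    """
    if genome_equivalents < 0:
        raise ValueError("genome_equivalents must be non-negative")
    return 1.0 - math.exp(-genome_equivalents)


if __name__ == "__main__":
    p = coverage_probability(3.0)
    print(f"Estimated coverage at 3 genome equivalents: {p:.3f}")
```

Under this model a 3 genome equivalent library would be expected to miss roughly one locus in twenty, which is consistent with the need to screen with multiple distinct probes when assembling a contig.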

Ninety-six novel exon traps have been obtained, containing sequences from a minimum of eighteen genes in this interval. The eighteen identified genes include five previously reported genes from the interval and a previously characterized gene whose location was unknown (Table I). Additional exon traps have been mapped to genes based on their presence in cDNAs, RT-PCR products, or their hybridization to distinct mRNA species on Northern blots.

TABLE I
Database Homologies
_________________________________________________________________________________________________________________________________________________
         Independent                          Transcript                                                              Accession No.
Gene(a)  Exon Traps(b)  Clone(c)              Size          Database Homology(d)                                      of Best Hit(e)  P value(f)
_________________________________________________________________________________________________________________________________________________
A        6              2 kb (cDNA)           8 kb          Probable protein kinase [S. cerevisiae]                   Z48149          6.3e-83
B        1              1.3 kb (cDNA)         2.5 kb        No significant homology
C        1              0.55 kb (Exon Trap)   1.4 kb        N-acetylglucosamine-6-phosphate deacetylase [C. elegans]  P34480          7.4e-73
                        0.6 kb (3' RACE)
D        2              Exon trap (1.59 bp)   --            Netrin-2 [G. gallus]                                      B54665          3.7e-11
                        Exon trap (196 bp)    --            Netrin-2 [G. gallus]                                      B54665          6.1e-33
E        1              Exon trap (100 bp)    --            ABC1 gene product [M. musculus]                           P41233          0.0047
F        3              1.1 kb (RT-PCR)       7 kb          ABC2 gene product [M. musculus]                           P41234          3.0e-28
         2              2.8 kb (cDNA)         7 kb          ABC1 gene product [M. musculus]                           P41233          7.1e-65
G        2              1.8 kb (cDNA)         2.5 kb        RNA-Binding protein [Homo sapiens]                        L37368          2.6e-176
H        2              1.2 kb (RT-PCR)       2.5 kb        phi AP3 [M. musculus]                                     S41688          2.9e-169
I        1              0.45 kb (Exon Trap)   3.0 + 4.5 kb  No significant homologies
J        2              0.24 kb (RT-PCR)      2 kb          Rab26 [R. norvegicus]                                     U18771          3.6e-56
K        1              Exon trap (219 bp)    §             40S Ribosomal protein S4 [Homo sapiens]                   P15880          7.3e-18
L        5              1.7 kb (cDNA)         1.6 kb        60S Ribosomal protein L3 [Homo sapiens]                   S34195          6.7e-233
M        1              0.7 kb (cDNA)         1.3 kb        Hypothetical 17.2 Kd protein [C. elegans]                 P34436          6.2e-10
_________________________________________________________________________________________________________________________________________________
(a) Gene as denoted in FIG. 1.
(b) Number of the trapped exon present in cloned cDNA or PCR product.
(c) Size of clone with type of clone indicated in parentheses.
(d) Significant homology in databases as determined by BLASTX.
(e) Accession Number of best hit.
(f) Smallest sum probability for the best database match.
-- Northern analysis was not performed due to the small size of the exon traps.
§ Up to 200 copies of LLREP3 are present in the genome.

Exon trapping was performed using an improved trapping vector (Burn et al., Gene 161:183-187, 1995), with the resulting exon traps being characterized by DNA sequence analysis. In order to determine the relative efficiency of the exon trapping procedure, exon traps were compared to the cDNA sequences for those genes known to be in the interval around the PKD1 gene (FIG. 1). Single exon traps were obtained from the human homologue of the ERV1 (Lisowsky et al., Genomics 29:690-697, 1995) and the ATP6C proton pump genes (Gillespie et al., Proc. Natl. Acad. Sci., USA 88:4289-4293, 1991). The horizontal line at the top of FIG. 1 shows the position of relevant DNA markers with the scale (in kilobases). The position of NotI sites is shown below the horizontal line. The position and orientation of the known genes are indicated by arrows with the number of exon traps obtained from each gene shown in parentheses. The positions of the transcription units described in this report (A through M) are shown below the known genes. The Genbank Accession numbers of corresponding exon traps are shown below each transcriptional unit. P1 clones are indicated by the overlapping lines with the name of the clone shown above the line. The positions of trapped exons which did not map to characterized transcripts are shown below the P1 contig. Vertical lines denote the interval within the P1 clone(s) detected by the exon traps in hybridization studies.

In contrast, eight individual exon traps were isolated from the TSC2 gene and ten from the CCNF gene (The European Chromosome 16 Tuberous Sclerosis Consortium, supra. 1993; Kraus et al., Genomics 24:27-33, 1994). Trapped sequences from three of the exons present in the PKD1 gene were obtained (The American PKD1 Consortium, Hum. Mol. Genet. 4:575-582, 1995; The International Polycystic Kidney Disease Consortium, Cell 81:289-298, 1995; Hughes et al., Nature Genet. 10:151-160, 1995). Sixteen additional exon traps from the 109.8C and 47.2H P1 clones were also obtained.

Sequences present in two exon traps (Genbank Accession Nos. L75926 and L75927), localizing to the region of overlap between the 96.4B and 64.12C P1 clones, were shown to contain sequences from the previously described human homologue to the murine RNPS1 gene (Genbank Accession No. L37368), encoding an S phase-prevalent DNA/RNA-binding protein (Schmidt et al., Biochim. Biophys. Acta 1216:317-320, 1993). A comparison of these exon traps to the dbEST database indicated that they were also contained in cDNA 52161 from the I.M.A.G.E. Consortium (Lennon et al., Genomics 33:151-152, 1996). Based on these data, the hRNPS1 gene can be mapped to 16p13.3 near DNA marker D16S291 (transcript G in FIG. 1).

Two exon traps from the 1.8F P1 clone were found to have a high level of homology to the previously described murine ΦAP3 gene (Fognani et al., EMBO J. 12:4985-4992, 1993). The mΦAP3 protein, a zinc finger-containing transcription factor, is believed to function as a negative regulator for genes encoding proteins responsible for the inhibition of cell cycling (Fognani et al., supra.). The two exon traps were linked by PCR, with the resulting 1.2 kb PCR product being 85% identical at the nucleotide level to the murine ΦAP3 cDNA. Hybridization of the ΦAP3-like exon traps to the dot blotted P1 contig indicated that the gene lies in the non-overlapping region of the 1.8F P1, between the DNA markers KLH7 and GGG12 (transcript H in FIG. 1).

Significant homology was also seen between two exon traps obtained from the 97.10G P1 and the rat Rab26 gene encoding a ras-related GTP-binding protein involved in the regulation of vesicular transport (Nuoffer et al., Ann. Rev. Biochem. 63:949-990, 1994; Wagner et al., Biochem. Biophys. Res. Comm. 207:950-956, 1995). The Rab26-like exon traps were linked by RT-PCR (transcript J in FIG. 1) with the encoded sequences being 94% (83/88) identical at the protein level to Rab26.

See, for example, FIG. 2, which shows alignments of the following selected exon traps with sequences in the databases. Sequences encoded by exon trap L48741 (SEQ ID NO:1) are shown aligned to N-acetylglucosamine-6-phosphate deacetylase from C. elegans (SEQ ID NO:2), E. coli (SEQ ID NO:3) and Haemophilus (SEQ ID NO:4). The EGF repeats from netrin-1 (SEQ ID NO:7), netrin-2 (SEQ ID NO:6) and UNC-6 (SEQ ID NO:8) are shown aligned to one of the translated netrin-like exon traps (Genbank Accession No. L75917) (SEQ ID NO:5). An alignment of sequences from the second netrin-like exon trap (Genbank Accession No. L75916) (SEQ ID NO:9) and netrin-1 (SEQ ID NO:11) and netrin-2 (SEQ ID NO:10) is shown. The translated Rab26-like RT-PCR product (Genbank Accession Nos. L48770-L48771) (SEQ ID NO:12) is shown aligned to rat Rab26 (SEQ ID NO:13). Sequences encoded by exon trap L48792 (SEQ ID NO:14) are shown aligned to sequences from the pilB transcriptional repressor from Neisseria gonorrhoeae (SEQ ID NO:15), sequences predicted by computer analysis to be encoded by cosmid F44E2.6 from C. elegans (SEQ ID NO:17), the YCL33C gene product from yeast (Genbank Accession No. P25566) (SEQ ID NO:16), and a transcriptional repressor from Haemophilus (SEQ ID NO:18). Periods denote positions where gaps were inserted in the protein sequence in order to maintain alignment.
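The percent identity figures quoted for aligned sequences, such as 94% (83/88) for the Rab26-like product, follow from counting matching residues over aligned positions. The following is a minimal sketch of that arithmetic, assuming two pre-aligned strings of equal length in which a period marks a gap, as in FIG. 2; the function name and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "."):
    """Return (matches, compared, percent) for two aligned sequences.

    Positions where either sequence carries the gap character are
    skipped, so the denominator counts only residue-to-residue pairs.
    Excluding gapped columns is an assumption of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    matches = sum(1 for a, b in pairs if a == b)
    return matches, len(pairs), 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Toy aligned peptides (hypothetical, for illustration only).
    print(percent_identity("ACD.EF", "ACDGEY"))
    # The ratio reported in the text: 83 identical residues out of 88.
    print(round(100.0 * 83 / 88, 1))
```

With 83 matches over 88 compared positions the ratio is about 94.3%, which rounds to the 94% reported above.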

In order to correlate exon traps with individual transcripts, cDNA library screening and PCR based approaches were used to clone transcribed sequences containing selected exon traps. RT-PCR was used to link individual exon traps together in cases where the two exon traps had homology to similar sequences in the databases. In cases where only single exon traps were available, 3' RACE or cDNA library screening was used to obtain additional sequences. Sequences from the exon traps and cloned products were used to map the position, and when possible the orientation, of the corresponding transcription units.

Six unique exon traps, containing sequences from at least eight exons, were shown to be from a transcriptional unit in the centromeric-most P1 clone, 94.10H (transcript A in FIG. 1). A 2 kb cDNA linking the six exon traps was isolated and shown to hybridize to an 8 kb transcript. Additional hybridization studies indicated that the gene was oriented centromeric to telomeric, with at least 6 kb of the transcript originating from sequences centromeric of the P1 contig. Extensive homology was observed between the translated cDNA and a variety of protein kinases; however, the presence of the conserved HRDLKPEN motif (SEQ ID NO:71) encoded in exon trap L48734, as well as the partial cDNA, suggests that it encodes a serine/threonine kinase (van-der-Geer et al., Ann. Rev. Cell Bio. 10:251-337, 1994).
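Checking a translated sequence for a conserved motif such as HRDLKPEN amounts to an exact substring search over the protein sequence. The sketch below assumes the translation is available as a one-letter amino acid string; the function name and the toy peptide are hypothetical and do not reproduce the sequence encoded by exon trap L48734.

```python
def find_motif(protein: str, motif: str = "HRDLKPEN") -> list:
    """Return the 0-based start positions of every exact motif match.

    Overlapping matches are reported because the search resumes one
    residue after each hit. Only exact matches are found; degenerate
    motif variants would need a pattern-based search instead.
    """
    positions = []
    start = protein.find(motif)
    while start != -1:
        positions.append(start)
        start = protein.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    # Toy peptide containing the motif (hypothetical sequence).
    toy_peptide = "MKTLVIHRDLKPENILLD"
    print(find_motif(toy_peptide))
```

An exact search is sufficient here because the text names one specific eight-residue motif; an empty result would simply mean the motif is absent from the supplied translation.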

cDNAs were isolated using sequences derived from a separate 94.10H exontrap (Genbank Accession No. L48738) and the position and orientation ofthe corresponding transcription unit were determined. Two cDNA specieswere obtained using exon trap L48738 as a probe, with the only homologybetween the two species arising from the 109 bases contained in the exontrap. Using oligonucleotide probes, the transcription unit was mapped toa position near the 26-6DIS DNA marker, in a telomeric to centromericorientation; however, only one of the cDNA species mapped to the P1contig (transcript B in FIG. 1). Based on these data, it is likely thatthe second cDNA species originated from a region outside of the P1contig, possibly from the duplicated 26-6PROX marker located furthercentromeric in 16p13.3 (Gillespie et al., Nuc. Acids Res. 18:7071-7075,1990).

The 110.1F P1 clone contains at least two genes in addition to the ATP6Cgene. Using BLASTX to search the protein databases, significant homologywas observed between sequences encoded by exon trap L48741 and theN-acetylglucosamine-6-phosphate deacetylase (nagA) proteins from C.elegans (Wilson et al., supra. 1994), E. coli (Plumbridge, Mol.Microbiol. 3:505-515, 1989) and Haemophilus (Fleischmann et al., Science269:496-512, 1995). An alignment of the nagA proteins to the translatedexon trap revealed the presence of multiple conserved regions (FIG. 2),suggesting that the exon trap contains sequences from the human nagAgene. Additional sequences from the nagA-like transcript have beencloned using 3' RACE and the transcription unit mapped to a regionbetween NotI sites 2 and 3 in FIG. 1. The gene is oriented telomeric tocentromeric with NotI site 2 being present in the 3' UTR of the RACEclone (transcript C in FIG. 1).

Two additional exon traps (Genbank Accession Nos. L75916 and L75917), mapping to the region of overlap between the 110.1F and 53.8B P1 clones (transcript D in FIG. 1), were shown to have homology with the chicken netrins (Kennedy et al., Cell 78:425-435, 1994; Serafini et al., Cell 78:409-424, 1994) and the C. elegans UNC-6 protein (Ishii et al., Neuron 9:873-881, 1992) (FIGS. 2 and 20A).

Sequences encoded by exon trap L75917 were shown to have significant homology with the C-terminal-most epidermal growth factor (EGF) repeat found in the netrin and UNC-6 proteins (FIGS. 2 and 20A). Exon trap L75917 encodes sequences which are 98% identical to sequences from the third EGF repeat of chicken netrin-2 and 90% identical to sequences from the same region of netrin-1. The netrin-like trap, L75916, encodes sequences from the more divergent C-terminal domain of the netrins which are 43% identical to sequences contained in the C-terminal domain of netrin-1 and netrin-2 (FIGS. 2 and 20A). This region is the least conserved between UNC-6 and the netrins, with sequences being 63% conserved between netrin-1 and netrin-2 and 29% conserved between netrin-2 and UNC-6 (Serafini et al., supra.).

The netrins define a family of chemotropic factors which have been shown to play a central role in axon guidance. Axonal growth cones are guided to their target by both local cues, present in the extracellular matrix or on the surface of cells, and long-range cues in the form of diffusible chemoattractants and chemorepellents (Goodman and Shatz, Cell 72:77-98, 1993; Keynes and Cook, Curr. Opin. Neurobiol. 5:75-82, 1995).

Chicken netrin-1 and netrin-2 have been shown to function as chemoattractants for developing spinal commissural axons (Serafini et al., Cell 78:409-424, 1994; Kennedy et al., Cell 78:425-435, 1994) with netrin-1 also acting as a chemorepellent for trochlear motor axons (Colamarino and Tessier-Lavigne, Cell 81:621-629, 1995). Comparative analysis revealed the presence of extensive homology between the chicken netrins and the C. elegans UNC-6 protein, which is required for circumferential cell migration and axon guidance (Hedgecock et al., Neuron 4:61-85, 1990; Ishii et al., Neuron 9:873-881, 1992). More recently, two Drosophila netrins, NETA and NETB, have been described and shown to be required for commissural axon guidance as well as for guidance of motor neurons to their target muscles (Harris et al., Neuron 17:217-228, 1996; Mitchell et al., Neuron 17:203-215, 1996). These studies indicate that the netrin family of chemoattractant and chemorepellent proteins is conserved between invertebrates and vertebrates.

The genomic interval containing the netrin-like exon traps was sequenced in order to obtain additional sequence information from the gene and to rule out the possibility that the exon traps were derived from a pseudogene. In preliminary studies using the 53.8B genomic P1 clone, the netrin-like exon traps were mapped to a 6 kb XhoI fragment. See, for example, FIG. 18 wherein relevant DNA markers are shown on top of the horizontal line, with NotI sites (N) being shown below the line. The location and orientation of the ATP6C, CCNF, and nagA transcriptional units have been previously described (Gillespie et al., Proc. Natl. Acad. Sci., USA 88:4289-4293, 1991; Kraus et al., Genomics 24:27-33, 1994; Burn et al., Genome Research 6:525-537, 1996) and are shown below the genomic interval. The two P1 clones containing the netrin gene are shown below the schematic diagram of the interval. The location of the 6.8 kb of genomic sequence is enlarged below the P1 clones. The position of the two exon traps in the 6.8 kb of genomic sequence is also indicated.

The 6 kb fragment, and the adjacent 3.5 kb XhoI fragment, were subcloned and used to screen a random shotgun library from the 53.8B P1 clone. Subclones which were positive by hybridization were sequenced with forward and reverse vector primers. A total of 88 subclones were sequenced in this manner.

Additional sequence was obtained using internal primers as well as end sequence from the parental XhoI fragments. A total of 6.8 kb of genomic sequence with an overall redundancy of 7-fold was sequenced. The GC-content for the sequenced region was found to be 68.9%, which is slightly higher than the 62.8% observed for the 53 kb of genomic sequence from the PKD1 gene, located 350 kb further telomeric (The American PKD1 Consortium, 1995, supra; Burn et al., 1996, supra).
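GC-content figures such as those quoted above are simple base-composition percentages. The following is a minimal sketch of the calculation; the function name is hypothetical, ambiguous bases are ignored by assumption, and the example sequence is the 20-mer listed as SEQ ID NO:65 in Table IV rather than the 6.8 kb genomic sequence itself.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among the unambiguous bases of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # assumption: skip N and gaps
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# SEQ ID NO:65 from Table IV, used here only as a short worked example.
print(gc_content("CCTGCAACGGCCATGCCCGC"))
```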

Computer analyses were performed to identify putative exons. GRAIL2 analysis predicted six exons within the 6.8 kb of genomic sequence, with database analysis indicating that all but one exon (exon 1) encoded sequences with homology to the chicken netrins. FIG. 19A shows a GRAIL2 analysis of coding sequences in the 6.8 kb of genomic sequence from the 53.8B P1, with the gray scale denoting GC-content (white to light gray is GC rich and gray to black is AT rich) and vertical boxes indicating relative quality of the predicted exons. A graphical depiction of the predicted exons is shown above the vertical boxes, with light colored boxes denoting exons with a score of "excellent" (>80% probability) and dark colored boxes denoting exons with a score of "good" (>60% probability). The positions of exon traps L75917 and L75916 (left to right, respectively) are shown above the GRAIL2 predicted exons. The structure of the gene based on comparison of the RT-PCR products and genomic sequence is shown at the top; the position of the exons in the genomic sequence is shown by the numbers above the exons. The 5' and 3' untranslated regions are also shown.

Additionally, the 6.8 kb of genomic sequence was compared to the protein sequences of the chicken netrins using a Pustell DNA/protein matrix. The genomic sequence (translated in all six frames) was compared to chicken netrin-2 in FIG. 19B, using a PAM250 matrix with the minimum homology set at 50% and the window set at 20. Regions of homology are shown by heavy diagonal lines. Five exons were predicted by this analysis, with only the first GRAIL2 predicted exon not appearing to be bona fide. Sequences from the two exon traps were also predicted by GRAIL2; however, there were noteworthy differences (cf. FIG. 19A). In predicting sequences present in exon trap L75917, GRAIL2 included an additional 55 bp at the 5' end of the exon. The first of the two exons present in exon trap L75916 was not predicted by GRAIL2, while GRAIL2 added additional bases to the 5' and 3' ends of the second exon present in this exon trap.
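The DNA/protein matrix comparison described above can be sketched in two steps: translate the genomic sequence in all six reading frames, then slide a fixed window along each frame and the reference protein and record the positions that reach a minimum score. The sketch below is illustrative only. It substitutes plain residue identity for the PAM250 scoring used in the text, uses the standard genetic code, and all function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna: str) -> str:
    """Translate one reading frame; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna: str) -> dict:
    """Return the three forward (+1..+3) and three reverse (-1..-3) translations."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


def window_hits(query: str, target: str, window: int = 20,
                min_identity: float = 0.5) -> list:
    """List (i, j) start positions whose windows reach min_identity.

    Identity scoring is an assumption made for brevity; the text used PAM250.
    """
    hits = []
    for i in range(len(query) - window + 1):
        for j in range(len(target) - window + 1):
            same = sum(q == t for q, t in
                       zip(query[i:i + window], target[j:j + window]))
            if same / window >= min_identity:
                hits.append((i, j))
    return hits


print(six_frames("ATGGCCTAA"))
print(window_hits("ACDEFG", "XXACDEFG", window=4, min_identity=1.0))
```

Runs of hits with a constant difference between the two coordinates correspond to the diagonal lines of a dot-matrix plot, which is how conserved exons appear in a comparison of this type.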

A search of the Expressed Sequence Tags (EST) database did not reveal the presence of any ESTs from the human netrin gene. Nor was the human netrin message detected by Northern and/or RNA dot blot analysis using mRNA from over fifty different adult and fetal tissues, suggesting that hNET has an extremely restricted pattern of expression and when expressed is present in low abundance. Two murine ESTs, however, were identified from a brain library and a whole fetus library (Genbank Accession Nos. W59766 and AA048205, respectively) which have significant homology to hNET. The murine ESTs contain overlapping sequence with a total of 477 bp of contiguous sequence being represented. This 477 bp contiguous sequence aligns to the 5' end of the human netrin cDNA and includes 47 bp of 5' UTR and sequences encoding the N-terminal 143 amino acids. A comparison of the deduced human and murine protein sequence indicated that the two proteins were 89.5% (128/143) identical.
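The identity figure quoted above is the number of matching residues divided by the number of aligned positions. A minimal sketch of that calculation follows; the function name and the short aligned strings are hypothetical, and gap columns are excluded from the denominator by assumption.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned, non-gap columns of two equal-length strings."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":  # assumption: gap columns are not counted
            continue
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# The human/murine comparison in the text: 128 identical residues out of 143.
print(round(100.0 * 128 / 143, 1))

# Toy alignment (hypothetical residues) to show the gap handling.
print(percent_identity("MKT-A", "MRTLA"))
```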

Characterization of the Human Netrin Transcript

In order to confirm the structure of the netrin gene, RT-PCR was performed using primers designed from the predicted exons. Since the predicted human netrin appeared to be slightly more homologous to netrin-2 than netrin-1 (57% versus 54%, respectively) and netrin-2 is expressed in the spinal cord of chicken, adult human spinal cord polyA+ RNA was utilized as a template. RT-PCR products were obtained with only a portion of the primer pairs; however, even this required the use of nested primers and two rounds of PCR, with low yields making it necessary to use hybridization and radiolabeled probes to visualize the products. The low yield, and lack of RT-PCR products in some cases, was attributed to the high GC-content of the products (70-80%). The addition of betaine to a final concentration of 2.5 M in the PCR reactions was found to dramatically improve yield and purity of the RT-PCR products (International Publication No. WO 96/12041; Reeves et al. (1994) Am. J. Hum. Genet. 55:A238; Baskaran et al. (1996) Genome Research 6:633-638).

Assembly of the RT-PCR products revealed a 1743 bp open reading frame (ORF) with an in-frame stop codon upstream of the proposed start methionine. In verifying the start and stop codons, a 209 bp 5' UTR and a 22 bp 3' UTR were cloned. Additional sequences from the respective UTRs were not cloned, however, since the goal of the RT-PCR experiments was to only confirm the predicted protein sequence and not to assemble a full-length cDNA. The position of the intron-exon boundaries was determined based on the comparison of the genomic sequence and the RT-PCR clones (FIG. 19A).

A 1.9 kb cDNA, hNET, was cloned by performing nested PCR using spinal cord cDNA as template and standard PCR conditions with the addition of betaine. The human netrin protein is predicted to be 580 amino acids in size, with the common domain structure of the netrin family being conserved. In FIG. 20A, positions where the chicken netrins and UNC-6 sequences match the human sequence are denoted by periods while gaps introduced during the alignment are shown by hyphens. Arrows above the sequence alignment show the boundaries of the laminin VI and V domains, and C-terminal region (C) as described (Serafini et al., Cell 78:409-424, 1994). The signal sequence (S) is also shown. V-1, V-2, and V-3 designate each of the EGF domains that constitute domain V. The hNET coding sequence and its predicted protein product are shown in FIGS. 4A and 4B. FIGS. 4C and 4D show full length hNET cDNA including both 5' and 3' UTR sequence.
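The reported sizes can be cross-checked with simple arithmetic. The sketch below assumes that the 1743 bp ORF figure includes the terminal stop codon, which the text does not state explicitly; under that assumption the ORF corresponds to 580 amino acids, and the ORF plus the cloned 5' and 3' UTRs totals 1974 bp, roughly in line with the cDNA size given above. The function name is hypothetical.

```python
ORF_BP = 1743   # open reading frame length reported in the text
UTR5_BP = 209   # cloned 5' UTR
UTR3_BP = 22    # cloned 3' UTR


def protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of orf_bp nucleotides."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = orf_bp // 3
    # Assumption: the quoted ORF length counts the stop codon.
    return codons - 1 if includes_stop else codons


print(protein_length(ORF_BP))          # amino acids
print(UTR5_BP + ORF_BP + UTR3_BP)      # total cloned cDNA length in bp
```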

Several lines of evidence rule against the possibility that the human netrin gene described herein represents a pseudogene. First, none of the exons in the coding region contain stop codons. Second, the overall gene structure described is highly conserved when compared to other members of the netrin/UNC-6 family. Third, despite the lack of signal in the Northern and RNA blot analysis, a mature transcript was isolated by RT-PCR. Finally, sequences in the murine EST database have been identified which are highly conserved. Taken together, these data indicate that a novel human netrin gene with a restricted pattern of expression has been identified.

Human netrins may have a significant role in neural regeneration. Though netrins do not by themselves promote axon growth, they do play a role in the orientation of axon growth. The combination of growth promoting activities with axon guidance cues would be a prerequisite for directed neural regeneration.

The ability to clone a gene with such a restricted pattern of expression points out one of the strengths of the exon trapping procedure, since it is unlikely that the netrin gene would have been identified using cDNA selection or direct library screening. These results highlight the need for using a variety of approaches to identify and clone sequences from a large genomic contig.

Exon trapping results further show that there is a novel ATP Binding Cassette (ABC) transporter in the PKD1 locus located between the LCN1 and D16S291 markers in a centromeric to telomeric orientation. Database searches with the exon trap sequences show homology to the murine ABC1 and ABC2 genes (Luciani et al., supra. 1994). The human homologs of murine ABC1 and ABC2 have been cloned and mapped to human chromosome 9 (Luciani et al., supra. 1994). Sequences derived from the trapped exons along with those from cDNA selection and SAmple SEquencing (SASE) were used to recover overlapping partial cDNA clones.

Seven exon traps with homology to ABC transporters were isolated from P1 clones 30.1F, 64.12C and 96.4B. Additional sequences encoded by the ABC3 gene were obtained by RT-PCR (placenta and brain RNA as template) and library PCR (using a commercially available lung cDNA library as template) using custom primers designed from the exon traps (Tables II and III). Three exon traps (L48758, L48759 and L48760) were obtained from the region of overlap between the 30.1F, 64.12C and 96.4B P1 clones (transcript F in FIG. 1), while a fourth exon (L48753) maps exclusively to the 79.2A P1 clone (transcript E in FIG. 1).

                                      TABLE II
__________________________________________________________________________
Oligonucleotides Used to Clone Additional Sequences
__________________________________________________________________________
Gene.sup.a  Method.sup.b  SEQ ID NO:  Oligonucleotide 1.sup.c  SEQ ID NO:  Oligonucleotide 2.sup.d  Clone size.sup.e
A           Genetrapper   36          TGACGCCGTGCCCATCCAGT     37          CAGCGTGGTGTTATGTTCCT     2.0 kb
B           Genetrapper   38          TTGGGCCTGTGCTGAACTAC     39          CGGCAAGCTGGTGATTAACA     1.3 kb
C           3'RACE        40          CGGCAGAGGATGCTGTGT       41          GCGGAGCCACCTTCATCA       0.6 kb
F           RT-PCR        42          GACGCTGGTGAAGGAGC        43          TCGCTGACCGCCAGGAT        1.1 kb
H           RT-PCR        44          CTGTCGGGAAGGTCTCACTG     45          GTTCACCGCCTTGGAGGATT     1.1 kb
J           RT-PCR        46          GTGTGGGGAAGACCTGTCTG     47          AGGAGGCCTTGTTGGTGACA     0.24 kb
L           Genetrapper   48          ACGGACACCTGGGCTTC        49          AAACGGGAGGAGGTGGA        1.7 kb
M           Genetrapper   50          TGTGGCTATGAGCTGTTCTC     51          GCAGTCCCGATTCTGAATAT     0.7 kb
__________________________________________________________________________
.sup.a Gene as denoted in FIG. 1.
.sup.b Method used to clone additional sequences: Life Technologies Genetrapper system, 3'RACE and RT-PCR.
.sup.c Sequence of oligonucleotides used to obtain additional sequences. For the Genetrapper system, this oligonucleotide was used in the direct selection step. In the case of 3'RACE experiments, this oligonucleotide was the external primer. In the case of RT-PCR experiments, the designated oligonucleotide was used as a sense primer.
.sup.d Sequence of oligonucleotides. In the Genetrapper experiments, this oligonucleotide was used in the repair step. For 3'RACE experiments, this was the internal primer. For RT-PCR experiments, this was the antisense primer.
.sup.e Size of clone obtained using the primer pair.

                                      TABLE IIIA
__________________________________________________________________________
Oligonucleotides Used to Clone Additional Sequences from human ABC3
__________________________________________________________________________
Method.sup.a  SEQ ID NO:  Oligonucleotide 1.sup.b  SEQ ID NO:  Oligonucleotide 2.sup.c  Clone name.sup.d  Clone size.sup.e
Genetrapper   52          CATTGCCCGTGCTGTCGTG      53          CATCGCCGCCTCCTTCATG      ABC3 (gt.1)       5.8 kb
RT-PCR        52          CATTGCCCGTGCTGTCGTG      54          GCGGAGCCACCTTCATCA       ABC3 (A12)        1.7 kb
RT-PCR        55          GACGCTGGTGAAGGAGC        56          ATCCTGGCGGTCAGCGA        ABC3 (3-12)       1.1 kb
RT-PCR        57          AGGGATTCGACATTGCC        58          CTTCAGAGACTCAGGGGCAT     ABC3 (#2)         0.5 kb
__________________________________________________________________________
.sup.a Method used to clone additional sequences: Life Technologies Genetrapper system and RT-PCR.
.sup.b Sequence of oligonucleotides used to obtain additional sequences. For the Genetrapper system, this oligonucleotide was used in the direct selection step. In the case of RT-PCR experiments, the designated oligonucleotide was used as a sense primer.
.sup.c Sequence of oligonucleotides. In the Genetrapper experiments, this oligonucleotide was used in the repair step. For RT-PCR experiments, this was the antisense primer.
.sup.d Assigned name of the isolated clone.
.sup.e Size of clone obtained using the primer pair.
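When comparing primer listings it helps to check strand orientation. As printed, Oligonucleotide 2 for gene F in Table II (SEQ ID NO:43) and the antisense primer for clone ABC3 (3-12) in Table IIIA (SEQ ID NO:56) are reverse complements of one another, which the following minimal sketch confirms. The function name is hypothetical and only unambiguous DNA bases are handled.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


seq_id_43 = "TCGCTGACCGCCAGGAT"  # Table II, gene F, Oligonucleotide 2
seq_id_56 = "ATCCTGGCGGTCAGCGA"  # Table IIIA, ABC3 (3-12), Oligonucleotide 2

print(reverse_complement(seq_id_43) == seq_id_56)
```

The check says nothing about which orientation was used at the bench; it only shows that the two listed sequences describe the same site on opposite strands.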

                                      TABLE IIIB
__________________________________________________________________________
Oligonucleotides Used to Clone Additional Sequences from human ABC3
__________________________________________________________________________
5' clone.sup.a  SEQ ID NO:  5' primer.sup.b      3' clone.sup.c  SEQ ID NO:  3' primer.sup.d       Clone.sup.e  Clone size.sup.f
et L48757       52          CATTGCCCGTGCTGTCGTG  et L48758       54          GCGGAGCCACCTTCATCA    ABC3         1.7 kb
et L48758       55          GACGCTGGTGAAGGAGC    et L48760       56          ATCCTGGCGGTCAGCGA     ABC3         1.1 kb
et L48760       57          GGGATTCGACATTGCC     et L75924       58          CTTCAGAGACTCAGGGGCAT  ABC3         0.5 kb
sel. cDNA/SASE  76          AGCTGGCGCTCCTCCTCT   et L48757       53          CATCGCCGCCTCCTTCATG   ABC3         0.9 kb
__________________________________________________________________________
.sup.a Clone used to derive the 5' primer.
.sup.b Sequence of the sense primer used in the RT-PCR reaction.
.sup.c Clone used to derive the 3' primer.
.sup.d Sequence of the antisense primer used in the RT-PCR reaction.
.sup.e Assigned name of the isolated clone.
.sup.f Size of clone obtained using the primer pair.

                                      TABLE IV
__________________________________________________________________________
Oligonucleotides Used to Clone Sequences from the human Netrin
__________________________________________________________________________
Method.sup.a    SEQ ID NO:  Oligonucleotide 1.sup.b  SEQ ID NO:  Oligonucleotide 2.sup.c  Clone name.sup.d  Clone size.sup.e
1.sup.o RT-PCR  59          GCCTGTCATCGCTCTAG        60          CAGTCGCAGGCCCTGCA
2.sup.o PCR     61          GAGGACGCGCCAACATC        62          CGGCAGTAGTGGCAGTG        1121-1123         1264 bp
1.sup.o RT-PCR  63          CCTGCCTCGCTTGCTCCTGC     64          CGGGCAGCCGCAGGCCGCAT
2.sup.o PCR     65          CCTGCAACGGCCATGCCCGC     66          GCATCCCCGGCGGGCACCCA     1131-1141         601 bp
1.sup.o RT-PCR  80          CTTGCAGGGCCTGCGAC        81          GAAGGCACAGGGTGAAC
2.sup.o PCR     82          CTGCAACCAGACCACAG        83          TAGATGTGGGAGCAGCG        1125-1127         629 bp
__________________________________________________________________________
.sup.a Method used to clone sequences. For 2.sup.o PCR, the 1.sup.o RT-PCR product was diluted to a final concentration of one to one thousand.
.sup.b Sequence of sense-strand oligonucleotides.
.sup.c Sequence of antisense-strand oligonucleotides.
.sup.d Assigned name of the isolated cDNA clones.
.sup.e Size of clone obtained using the primer pair.
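The GC-rich character of the netrin primers in Table IV can be made concrete with a rough melting-temperature estimate. The sketch below uses the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C) in degrees Celsius, which is a generic approximation intended for short oligonucleotides. It is not stated to be the method used in this work, and it does not model the effect of betaine. The function name is hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Approximate Tm in degrees Celsius by the Wallace rule: 2(A+T) + 4(G+C)."""
    primer = primer.upper()
    at = sum(primer.count(b) for b in "AT")
    gc = sum(primer.count(b) for b in "GC")
    if at + gc != len(primer):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


# Sense primers for two of the second-round PCRs listed in Table IV.
for name, seq in [("SEQ ID NO:61", "GAGGACGCGCCAACATC"),
                  ("SEQ ID NO:65", "CCTGCAACGGCCATGCCCGC")]:
    print(name, wallace_tm(seq))
```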

Exon traps from the hABC3 transporter encoded by transcript F encode sequences with homology to the R-domain of the murine ABC1 and ABC2 genes. The R-domain is believed to play a regulatory role based on the comparison to a conserved region in CFTR. To date, only ABC1, ABC2 and CFTR have been shown to contain an R-domain (Luciani et al., supra. 1994).

Additionally, a 1.1 kb RT-PCR product which links the three exon traps from transcript F has been obtained, and this RT-PCR product detects a 7 kb message on Northern blots. Based on a search of the dbEST database, a cDNA from this region was obtained, with sequences from exon traps L75924 and L75925 being contained in cDNA 49233 from the I.M.A.G.E. Consortium (Lennon et al., supra.). The presence of both cloned reagents in the same transcription unit has been confirmed using RT-PCR.

The ATP binding cassette (ABC) transporters, or traffic ATPases, comprise a family of more than 100 proteins responsible for the transport of a wide variety of substrates across cell membranes in both prokaryotic and eukaryotic cells (Higgins, C. F., Annu. Rev. Cell. Biol. 8:67-113, 1992; Higgins, C. F., Cell 82:693-696, 1995). Proteins belonging to the ABC transporter superfamily are linked by strong structural similarities. Typically ABC transporters have four conserved domains: two hydrophobic domains which may impart substrate specificity (Payne et al., Mol. Gen. Genet. 200:493-496, 1985; Foote et al., Nature 345:255-258, 1990; Anderson et al., Science 253:202-205, 1991; Shustik et al., Br. J. Haematol. 79:50-56, 1991; Covitz et al., EMBO J. 13:1752-1759, 1994), and two highly conserved domains associated with ATP binding and hydrolysis (Higgins, supra. 1992). ABC transporters govern unidirectional transport of molecules into or out of cells and across subcellular membranes (Higgins, supra. 1992). Their substrates range from heavy metals (Ouellette et al., Res. Microbiol. 142:737-746, 1991) to peptides and full size proteins (Gartner et al., Nature Genet. 1:16-23, 1992).

In eukaryotic cells, ABC transporters exist either as single large symmetrical proteins containing all four domains or as dimers resulting from the association of two smaller polypeptides each containing a hydrophobic and ATP-binding domain. Examples of this multimeric structural form are human TAP proteins (Kelly et al., Nature 355:641-644, 1992) and the functional PMP70 protein (Kamijo et al., J. Biol. Chem. 265:4534-40, 1990). This multimeric structure is also found in numerous prokaryotic ABC transporters. The hydrophobic regions are composed of up to six transmembrane spanning segments. Each ATP binding domain operates independently and may or may not be functionally equivalent (Kerem et al., Science 245:1073-80, 1989; Mimmack et al., Proc. Natl. Acad. Sci., USA 86:8257-61, 1989; Cutting et al., Nature 346:366-369, 1990; Kerppola et al., J. Biol. Chem. 266:9857-65, 1991).

Several of the ABC transporters thus far identified in humans have been shown to be clinically important. For example, overexpression of P-glycoproteins is responsible for multi-drug resistance in tumors (Gottesman et al., Ann. Rev. Biochem. 62:385-427, 1993). Classical cystic fibrosis (CF) as well as a large proportion of cases of congenital bilateral absence of the vas deferens (CBAVD) are caused by mutations in the cystic fibrosis transmembrane conductance regulator (CFTR), an ABC transporter (Kerem et al., supra.; Cutting et al., supra.). Defects in ABC transporters have also been implicated in Zellweger syndrome (Gartner et al., supra.), and adrenoleukodystrophy (Mosser et al., Nature 361:726-730, 1993).

Two members of a novel ABC transporter subgroup (murine ABC1 and ABC2) have been shown to contain domains similar to the regulatory R-domain of CFTR (Luciani et al., supra. 1994). Functionally, the mouse ABC1 protein has been shown to play a role in macrophage engulfment of apoptotic cells (Luciani et al., EMBO J. 16:226-235, 1996), while the function of ABC2 remains unknown. All three proteins contain a large charged region containing several potential phosphorylation sites (Kerem et al., supra.; Luciani et al., supra. 1994). The charged amino acid residues within this region are sequentially arranged in blocks of alternating positive and negative charge.

A common feature of these particular ABC transporters, including hABC3, is the presence of a large linker domain between the two ATP binding cassettes. The presence of numerous polar residues and potential phosphorylation sites in the linker domain suggests that this region may play a regulatory role, perhaps similar to that of the R-domain of CFTR (Kerem et al., supra.). In addition, the four proteins also contain a hydrophobic region, the HH1 domain (Luciani et al., supra. 1994), within the conserved linker domain. Although there is little homology at the sequence level between the HH1 domains of hABC3 and the murine ABCs, they appear to be structurally conserved with each domain predicted to have β-sheet conformation. The similarity between these proteins would suggest that they all belong to the same ABC subfamily, originally defined by ABC1 and ABC2 (Luciani et al., supra. 1994). The genes encoding the human homologues of ABC1 and ABC2 have been mapped to human chromosome 9 at q22-q31 and q34, respectively (Luciani et al., supra. 1994).

Although they are members of the same subfamily, ABC1, ABC2 and hABC3 are likely to have different functional roles. The differences present in the transmembrane and linker domains of ABC1, ABC2 and hABC3 may confer on each a unique substrate specificity. For example, alterations and mutations in the transmembrane domains of both prokaryotic and eukaryotic ABC transporters have been shown to alter substrate specificity (Payne et al., supra.; Foote et al., supra.; Covitz et al., supra.), while changes to the R-domain of CFTR have been shown to alter its ion selectivity (Anderson et al., supra.; Rich et al., Science 253:205-207, 1991). The differences in the expression patterns of ABC1, ABC2 and hABC3 also suggest that the proteins may be functionally distinct. Murine ABC1 and ABC2 have been shown to be expressed at varying levels in a wide variety of adult and embryonic tissues, with the highest levels of ABC1 expression being seen in pregnant uterus and regions rich in monocytic cells, while the highest levels of ABC2 expression were seen in brain (Luciani et al., supra. 1994; Luciani et al., supra. 1996). In contrast, hABC3 is preferentially expressed in lung, with significantly lower levels of expression being seen in brain, heart, and pancreas.

Despite the structural differences between ABC1, ABC2 and hABC3, it remains possible that the three proteins play similar functional roles in different cell populations. To date, no function has been proposed for murine ABC2. However, recent data indicate that ABC1 is required for the engulfment of cells undergoing apoptosis, though the molecular mechanism underlying ABC1 function is unknown (Luciani et al., supra. 1996). If hABC3 functions in a manner similar to ABC1, it could be expressed by pulmonary macrophages involved in host defense.

ABC transporters have been described for substrates ranging from small ions to large polysaccharides and proteins. Based on the high level of expression in lung, the substrate for hABC3 may play an integral role in lung function, including ion or polysaccharide transport. Further clues may be provided by a closer examination of hABC3 expression in the lung. These studies would include identifying the lung cells responsible for hABC3 expression as well as determining the subcellular localization of hABC3. The identification and cloning of the hABC3 cDNA may have implications for cystic fibrosis, since it contains a potential R-domain and is expressed at highest levels in the lung. If hABC3 does play an integral role in lung function, then modulation or alteration of hABC3 substrate specificity could have significant therapeutic implications for CF.

Several cDNAs were cloned using the GeneTrapper direct selection system and oligos designed from the 5' most trapped exon encoding sequences with homology to ABC1 (trapped exon L48747). The longest clone isolated with the GeneTrapper system from a normal human lung cDNA library using custom oligonucleotides designed from the 5' most exon trap was 5719 bp in length (ABCgt.1). An additional cDNA clone (ABC.5) was isolated using a radiolabeled 1.1 kb RT-PCR product (ABC3-12) as a probe (FIG. 15). The 5' end of the ABC3 cDNA was further characterized using 5' RACE, with several RACE products containing multiple in-frame stop codons upstream of the start methionine.

Accordingly, the present invention provides a novel human ABC gene which has homology to the murine ABC1 and ABC2 genes, as well as sequences predicted to be encoded by cosmid C48B4.4 from C. elegans (Wilson et al., supra.). A 6.4 kb cDNA has been assembled for the hABC3 transporter. The assembled cDNA contains a 5116 nucleotide long open reading frame encoding 1705 amino acids, with the predicted protein having a molecular weight of 191 kDa. The proposed start methionine is 50 bp upstream of the 5' end of clone ABCgt.1.

Five trapped exons from P1 clones 109.8C and 47.2H were shown to contain sequences with homology to the human ribosomal protein L3 cDNA, with hybridization studies indicating that the L3-like gene is oriented centromeric to telomeric (transcript L in FIG. 1). The ribosomal L3 gene product is one of five essential proteins for peptidyltransferase activity in the large ribosomal subunit (Schulze and Nierhaus, EMBO J. 1:609-613, 1982). Not surprisingly, the L3 amino acid sequence is highly conserved across species. Mammalian L3 genes showing ˜98% protein sequence identity have been characterized from man (Genbank Accession No. X73460), mouse (Peckham et al., Genes Dev. 3:2062-2071, 1989), rat (Kuwano and Wool, Biochem. Biophys. Res. Comm. 187:58-64, 1992) and cow (Simonic et al., Biochim. Biophys. Acta 1219:706-710, 1994). The cumulative percent identity between the trapped exons and the reported human ribosomal protein L3 cDNA was 74% (537/724) at the nucleotide level.

A full-length cDNA encoding a novel ribosomal L3 protein subtype, SEM L3, was isolated and sequenced (FIG. 11). This gene is now designated RPL3L and has been assigned GenBank Accession No. U65581. The deduced protein sequence is 407 amino acids long and shows 77% identity to other known mammalian L3 proteins, which are themselves highly conserved. Hybridization analysis of human genomic DNA suggests this novel gene is single copy and has a tissue-specific pattern of expression.

The expression pattern of the previously identified human L3 gene and the novel human RPL3L was determined using multiple tissue Northern blots. The human L3 gene showed a ubiquitous pattern of expression in all tissues, with the highest expression in the pancreas. In contrast, the novel gene described herein is strongly expressed in skeletal muscle and heart tissue, with low levels of expression in the pancreas. This novel gene, RPL3L (Ribosomal Protein L3-Like), is located in a gene-rich region near the PKD1 and TSC2 genes on chromosome 16p13.3.

The RPL3L protein is more closely related to the above-mentioned cytoplasmic ribosomal proteins than to previously described nucleus-encoded mitochondrial proteins (Graack et al., Eur. J. Biochem. 206:373-380, 1992). The presence of a highly conserved nuclear localization sequence in RPL3L further supports the hypothesis that it represents a novel cytoplasmic L3 ribosomal protein subtype and not a nucleus-encoded mitochondrial protein.

In addition, an exon trap (Genbank Accession No. L48792) from a gene which is located telomeric of the L3-like gene was obtained (transcript M in FIG. 1). Sequences encoded by transcript M were shown to have homology to pilB from Neisseria gonorrhoeae (Taha et al., EMBO J. 7:4367-4378, 1988) as well as to a computer-predicted 17.2 kDa protein encoded by cosmid F44E2.6 from C. elegans (Wilson et al., supra.).

Using sequences from exon trap L48792, a 600 bp partial cDNA was isolated and it was determined that the corresponding gene is oriented centromeric to telomeric. A 1.3 kb message was detected by the cDNA on Northern blots. Sequences conserved between the partial cDNA and the hypothetical 17.2 kDa protein were also conserved in the pilB protein from Neisseria gonorrhoeae (Taha et al., supra. 1988), a hypothetical 19.3 kDa protein from yeast (Genbank Accession No. P25566), and a fimbrial transcription regulation repressor from Haemophilus (Fleischmann et al., Science 269:496-512, 1995) (FIG. 2). The pilB protein has homology to histidine kinase sensors and has been shown to play a role in the repression of pilin production in Neisseria gonorrhoeae (Taha et al., supra. 1988; Taha et al., Mol. Microbiol. 5:137-148, 1991). However, residues conserved between pilB, transcript M and the C. elegans, yeast, and Haemophilus sequences do not include the conserved histidine kinase domains from pilB (Taha et al., supra. 1991). These findings suggest that the conserved region in transcript M has a function which is independent of the proposed histidine kinase sensor activity of pilB.

An additional exon trap from the region of overlap between the 109.8C and 47.2H P1 clones was shown to contain human LLRep3 sequences (Slynn et al., Nuc. Acids Res. 18:681, 1990). Hybridization studies indicated that the LLRep3 sequences (transcript K in FIG. 1) were located between the sazD and L3-like genes. The region of highest gene density appears to be at the telomeric end of this cloned interval, particularly the region between TSC2 and D16S84, with a minimum of five genes mapping to this region (transcription units K, L and M, sazD and hERV1).

Also mapped to this region was an exon trap which is 86% identical (170/197) at the nucleotide level to the previously described rat augmenter of liver regeneration (ALR) (Hagiya et al., Proc. Natl. Acad. Sci. USA 91:8142-8146, 1994). ALR is a growth factor which augments the growth of damaged liver tissue while having no effect on the resting liver. Studies have demonstrated that rat ALR is capable of augmenting hepatocytic regeneration following hepatectomy.

This ALR-like exon trap was also shown to contain sequences from the recently described hERV1 gene, which encodes a functional homologue to yeast ERV1 (Lisowsky et al., supra.).

A 468 bp cDNA, hALR, has been obtained from the human ALR gene (FIG. 13). The ALR sequences encode a 119 amino acid protein which is 84.8% identical and 94.1% similar to the rat ALR protein (FIG. 14).

The cloning of human ALR has significant implications in the treatment of degenerative liver diseases. For example, biologically active rat ALR has been produced from COS-7 cells expressing rat ALR cDNA (Hagiya et al., supra.). Accordingly, recombinant hALR could be used in the treatment of damaged liver. In addition, a construct expressing hALR could be used in gene therapy to treat chronic liver diseases.

Forty-three of the trapped exons did not have significant homology to sequences in the protein or DNA databases, nor were ESTs (expressed sequence tags) containing sequences from the exon traps observed in dbEST. The absence of ESTs containing sequences from these novel exon traps is not surprising, since one of the criteria for selecting exon traps for further analysis was the presence of an EST in the database. These trapped exons are likely to represent bona fide products, since in many cases they were trapped multiple times from different P1 clones and in combination with flanking exons.

The present invention encompasses novel human genes and isolated nucleic acids comprising unique exon sequences from chromosome 16. The sequences described herein provide a valuable resource for transcriptional mapping and create a set of sequence-ready templates for a gene-rich interval responsible for at least two inheritable diseases.

Accordingly, the present invention provides isolated nucleic acids encoding human netrin (hNET), human ATP Binding Cassette transporter (hABC3), human ribosomal L3 (RPL3L) and human augmenter of liver regeneration (hALR) polypeptides. The present invention further provides isolated nucleic acids comprising unique exon sequences from chromosome 16. The term "nucleic acids" (also referred to as polynucleotides) encompasses RNA as well as single- and double-stranded DNA, cDNA and oligonucleotides. As used herein, the term "isolated" means a polynucleotide that is in a form that does not occur in nature.

One means of isolating polynucleotides encoding invention polypeptides is to probe a human tissue-specific library with a natural or artificially designed DNA probe using methods well known in the art. DNA probes derived from the human netrin gene, hNET, the human ABC transporter gene, hABC3, the human ribosomal protein L3 gene, RPL3L, or the human augmenter of liver regeneration gene, hALR, are particularly useful for this purpose. DNA and cDNA molecules that encode invention polypeptides can be used to obtain complementary genomic DNA, cDNA or RNA from human, mammalian, or other animal sources, or to isolate related cDNA or genomic clones by the screening of cDNA or genomic libraries, by methods described in more detail below.

The present invention encompasses isolated nucleic acid sequences, including sense and antisense oligonucleotide sequences, derived from the sequences shown in FIGS. 3, 4, 8, 11 and 15. hNET-, hABC3-, RPL3L- (SEM L3-), and hALR-derived sequences may also be associated with heterologous sequences, including promoters, enhancers, response elements, signal sequences, polyadenylation sequences, and the like. Furthermore, the nucleic acids can be modified to alter stability, solubility, binding affinity, and specificity. For example, invention-derived sequences can further include nuclease-resistant phosphorothioate, phosphoramidate, and methylphosphonate derivatives, as well as "peptide nucleic acid" (PNA) formed by conjugating bases to an amino acid backbone as described in Nielsen et al., Science 254:1497, 1991. The nucleic acid may be derivatized by linkage of the α-anomer nucleotide, or by formation of a methyl or ethyl phosphotriester or an alkyl phosphoramidate linkage. Furthermore, the nucleic acid sequences of the present invention may also be modified with a label capable of providing a detectable signal, either directly or indirectly. Exemplary labels include radioisotopes, fluorescent molecules, biotin, and the like.

In general, nucleic acid manipulations according to the present invention use methods that are well known in the art, as disclosed in, for example, Sambrook et al., Molecular Cloning, A Laboratory Manual 2d Ed. (Cold Spring Harbor, N.Y., 1989), or Ausubel et al., Current Protocols in Molecular Biology (Greene Assoc., Wiley Interscience, New York, N.Y., 1992).

Examples of nucleic acids are RNA, cDNA, or genomic DNA encoding a human netrin, a human ABC transporter, a human ribosomal L3 subtype, or a human augmenter of liver regeneration polypeptide. Such nucleic acids may have coding sequences substantially the same as the coding sequence shown in FIGS. 3, 4, 8, 11 and 15, respectively.

The present invention further provides isolated oligonucleotides corresponding to sequences within the hNET, hABC3, RPL3L (formerly SEM L3), and hALR genes, or within the respective cDNAs, which, alone or together, can be used to discriminate between the authentic expressed gene and homologues or other repeated sequences. These oligonucleotides may be from about 12 to about 60 nucleotides in length, preferably about 18 nucleotides, may be single- or double-stranded, and may be labeled or modified as described below.

This invention also encompasses nucleic acids which differ from the nucleic acids shown in FIGS. 3, 4, 8, 11 and 15, but which have the same phenotype, i.e., encode substantially the same amino acid sequence set forth in FIGS. 3, 4, 8, 11 and 15, respectively. Phenotypically similar nucleic acids are also referred to as "functionally equivalent nucleic acids". As used herein, the phrase "functionally equivalent nucleic acids" encompasses nucleic acids characterized by slight and non-consequential sequence variations that will function in substantially the same manner to produce the same protein product(s) as the nucleic acids disclosed herein. In particular, functionally equivalent nucleic acids encode proteins that are the same as those disclosed herein or that have conservative amino acid variations. For example, conservative variations include substitution of a non-polar residue with another non-polar residue, or substitution of a charged residue with a similarly charged residue. These variations include those recognized by skilled artisans as those that do not substantially alter the tertiary structure of the protein.

Further provided are nucleic acids encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, and human augmenter of liver regeneration polypeptides that, by virtue of the degeneracy of the genetic code, do not necessarily hybridize to the invention nucleic acids under specified hybridization conditions. Preferred nucleic acids encoding the invention polypeptide are comprised of nucleotides that encode substantially the same amino acid sequence set forth in FIGS. 4, 8, 11 and 15. Alternatively, preferred nucleic acids encoding the invention polypeptide(s) hybridize under high stringency conditions to substantially the entire sequence, or substantial portions (i.e., typically at least 12 to 60 nucleotides) of the nucleic acid sequence set forth in FIGS. 3, 4, 8, 11 and 15, respectively.
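The effect of degeneracy of the genetic code can be illustrated with a short sketch. The standard genetic code is used; the function names and the two nine-nucleotide example sequences are hypothetical and are not taken from the invention sequences.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence (read from position 0) into one-letter amino acids."""
    seq = dna.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def count_mismatches(seq_a, seq_b):
    """Count positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Hypothetical example: two coding sequences that differ only by synonymous codons.
variant_1 = "ATGGCTTTA"
variant_2 = "ATGGCCCTG"
print(translate(variant_1), translate(variant_2), count_mismatches(variant_1, variant_2))
```

In this illustration both sequences encode the same tripeptide although they differ at three of nine nucleotide positions, which is why such variants need not hybridize to one another under stringent conditions.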

Stringency of hybridization, as used herein, refers to conditions under which polynucleotide hybrids are stable. As known to those of skill in the art, the stability of hybrids is a function of sodium ion concentration and temperature. (See, for example, Sambrook et al., supra.).
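By way of illustration only, the dependence of hybrid stability on sodium ion concentration can be sketched with a commonly used empirical estimate of melting temperature for long DNA duplexes, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 0.63(% formamide) - 600/N. This formula is not recited in the present disclosure and is assumed here; the function name and the example sequence are hypothetical.

```python
import math


def estimate_tm_celsius(sequence, sodium_molar, formamide_percent=0.0):
    """Empirical Tm estimate (degrees C) for a long DNA duplex.

    Assumed formula (an approximation, not part of the source text):
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.63*(%formamide) - 600/N
    """
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    if sodium_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / n
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / n)


# Hypothetical 200-nucleotide probe with 50% G+C content.
probe = "ATGC" * 50
for sodium in (1.0, 0.1, 0.01):
    print(sodium, round(estimate_tm_celsius(probe, sodium), 1))
```

Under this estimate, lowering the sodium ion concentration lowers the temperature at which a hybrid remains stable, so a wash at low salt and high temperature is more stringent than one at high salt and low temperature.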

The present invention provides isolated polynucleotides operatively linked to a promoter of RNA transcription, as well as other regulatory sequences. As used herein, the phrase "operatively linked" refers to the functional relationship of the polynucleotide with regulatory and effector sequences of nucleotides, such as promoters, enhancers, transcriptional and translational stop sites, and other signal sequences. For example, operative linkage of a polynucleotide to a promoter refers to the physical and functional relationship between the polynucleotide and the promoter such that transcription of DNA is initiated from the promoter by an RNA polymerase that specifically recognizes and binds to the promoter, and wherein the promoter directs the transcription of RNA from the polynucleotide.

Promoter regions include specific sequences that are sufficient for RNA polymerase recognition, binding and transcription initiation. Additionally, promoter regions include sequences that modulate the recognition, binding and transcription initiation activity of RNA polymerase. Such sequences may be cis acting or may be responsive to trans acting factors. Depending upon the nature of the regulation, promoters may be constitutive or regulated. Examples of promoters are SP6, T4, T7, SV40 early promoter, cytomegalovirus (CMV) promoter, mouse mammary tumor virus (MMTV) steroid-inducible promoter, Moloney murine leukemia virus (MMLV) promoter, and the like.

Vectors that contain both a promoter and a cloning site into which a polynucleotide can be operatively linked are well known in the art. Such vectors are capable of transcribing RNA in vitro or in vivo, and are commercially available from sources such as Stratagene (La Jolla, Calif.) and Promega Biotech (Madison, Wis.). In order to optimize expression and/or in vitro transcription, it may be necessary to remove, add or alter 5' and/or 3' untranslated portions of the clones to eliminate extra, potential inappropriate alternative translation initiation codons or other sequences that may interfere with or reduce expression, either at the level of transcription or translation. Alternatively, consensus ribosome binding sites can be inserted immediately 5' of the start codon to enhance expression. Similarly, alternative codons, encoding the same amino acid, can be substituted for coding sequences of the human netrin, human ABC3 transporter, the human ribosomal L3 subtype, or the human augmenter of liver regeneration polypeptide in order to enhance transcription (e.g., the codon preference of the host cell can be adopted, the presence of G-C rich domains can be reduced, and the like).

Examples of vectors are viruses, such as baculoviruses and retroviruses, bacteriophages, cosmids, plasmids, fungal vectors and other recombination vehicles typically used in the art which have been described for expression in a variety of eukaryotic and prokaryotic hosts, and may be used for gene therapy as well as for simple protein expression.

Polynucleotides are inserted into vector genomes using methods well known in the art. For example, insert and vector DNA can be contacted, under suitable conditions, with a restriction enzyme to create complementary ends on each molecule that can pair with each other and be joined together with a ligase. Alternatively, synthetic nucleic acid linkers can be ligated to the termini of restricted polynucleotide. These synthetic linkers contain nucleic acid sequences that correspond to a particular restriction site in the vector DNA. Additionally, an oligonucleotide containing a termination codon and an appropriate restriction site can be ligated for insertion into a vector containing, for example, some or all of the following: a selectable marker gene, such as the neomycin gene for selection of stable or transient transfectants in mammalian cells; enhancer/promoter sequences from the immediate early gene of human CMV for high levels of transcription; transcription termination and RNA processing signals from SV40 for mRNA stability; SV40 polyoma origins of replication and ColE1 for proper episomal replication; versatile multiple cloning sites; and T7 and SP6 RNA promoters for in vitro transcription of sense and antisense RNA. Other means are well known and available in the art.

Also provided are vectors comprising a polynucleotide encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, and human augmenter of liver regeneration polypeptides, adapted for expression in a bacterial cell, a yeast cell, an amphibian cell, an insect cell, a mammalian cell and other animal cells. The vectors additionally comprise the regulatory elements necessary for expression of the polynucleotide in the bacterial, yeast, amphibian, mammalian or animal cells so located relative to the polynucleotide encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides as to permit expression thereof. As used herein, "expression" refers to the process by which polynucleotides are transcribed into mRNA and translated into peptides, polypeptides, or proteins. If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA, if an appropriate eukaryotic host is selected. Regulatory elements required for expression include promoter sequences to bind RNA polymerase and translation initiation sequences for ribosome binding. For example, a bacterial expression vector includes a promoter such as the lac promoter and, for translation initiation, the Shine-Dalgarno sequence and the start codon AUG (Sambrook et al., supra.). Similarly, a eukaryotic expression vector includes a heterologous or homologous promoter for RNA polymerase II, a downstream polyadenylation signal, the start codon AUG, and a termination codon for detachment of the ribosome. Such vectors can be obtained commercially or assembled from the sequences described by methods well known in the art, for example, the methods described above for constructing vectors in general. Expression vectors are useful to produce cells that express the invention polypeptides.

This invention provides a transformed host cell that recombinantly expresses the human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides. Invention host cells have been transformed with a polynucleotide encoding a human netrin, a human ABC3 transporter, a human ribosomal L3 subtype, or a human augmenter of liver regeneration polypeptide. An example is a mammalian cell comprising a plasmid adapted for expression in a mammalian cell. The plasmid contains a polynucleotide encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide and the regulatory elements necessary for expression of the invention protein.

Appropriate host cells include bacteria, archaebacteria, fungi, especially yeast, plant cells, insect cells and animal cells, especially mammalian cells. Of particular interest are E. coli, B. subtilis, Saccharomyces cerevisiae, SF9 cells, C129 cells, 293 cells, Neurospora, and CHO cells, COS cells, HeLa cells, and immortalized mammalian myeloid and lymphoid cell lines. Preferred replication systems include M13, ColE1, SV40, baculovirus, lambda, adenovirus, artificial chromosomes, and the like. A large number of transcription initiation and termination regulatory regions have been isolated and shown to be effective in the transcription and translation of heterologous proteins in the various hosts. Examples of these regions, methods of isolation, manner of manipulation, and the like, are known in the art. Under appropriate expression conditions, host cells can be used as a source of recombinantly produced hNET, hABC3, RPL3L (formerly SEM L3) and/or hALR.

Nucleic acids (polynucleotides) encoding invention polypeptides may also be incorporated into the genome of recipient cells by recombination events. For example, such a sequence can be microinjected into a cell, and thereby effect homologous recombination at the site of an endogenous gene encoding hNET, hABC3, RPL3L (formerly SEM L3), and/or hALR, an analog or pseudogene thereof, or a sequence with substantial identity to a hNET-, hABC3-, RPL3L- (SEM L3-), or hALR-encoding gene. Other recombination-based methods such as nonhomologous recombinations or deletion of an endogenous gene by homologous recombination, especially in pluripotent cells, may also be used.

The present invention provides isolated peptides, polypeptide(s) and/or protein(s) encoded by the invention nucleic acids. The present invention also encompasses isolated polypeptides having a sequence encoded by hNET, hABC3, RPL3L (SEM L3), and hALR genes, as well as peptides of six or more amino acids derived therefrom. The polypeptide(s) may be isolated from human tissues obtained by biopsy or autopsy, or may be produced in a heterologous cell by recombinant DNA methods as described herein.

As used herein, the term "isolated" means a protein molecule free of cellular components and/or contaminants normally associated with a native in vivo environment. Invention polypeptides and/or proteins include any naturally occurring allelic variant, as well as recombinant forms thereof. Invention polypeptides can be isolated using various methods well known to a person of skill in the art.

The methods available for the isolation and purification of invention proteins include precipitation, gel filtration, and chromatographic methods including molecular sieve, ion-exchange, and affinity chromatography using, e.g., hNET-, hABC3-, RPL3L- (SEM L3-), and/or hALR-specific antibodies or ligands. Other well-known methods are described in Deutscher et al., Guide to Protein Purification: Methods in Enzymology Vol. 182 (Academic Press, 1990). When the invention polypeptide to be purified is produced in a recombinant system, the recombinant expression vector may comprise additional sequences that encode additional amino-terminal or carboxy-terminal amino acids; these extra amino acids act as "tags" for immunoaffinity purification using immobilized antibodies or for affinity purification using immobilized ligands.

Peptides comprising hNET-, hABC3-, RPL3L- (SEM L3-) or hALR-specific sequences may be derived from the isolated larger hNET, hABC3, RPL3L (SEM L3), or hALR polypeptides described above, using proteolytic cleavage by, e.g., proteases such as trypsin, or chemical treatments such as cyanogen bromide, that are well known in the art. Alternatively, peptides up to 60 residues in length can be routinely synthesized in milligram quantities using commercially available peptide synthesizers.

An example of the means for preparing the invention polypeptide(s) is to express polynucleotides encoding hNET, hABC3, RPL3L (SEM L3), and/or hALR in a suitable host cell, such as a bacterial cell, a yeast cell, an amphibian cell (e.g., oocyte), an insect cell (e.g., Drosophila) or a mammalian cell, using methods well known in the art, and recovering the expressed polypeptide, again using well-known methods. Invention polypeptides can be isolated directly from cells that have been transformed with expression vectors, described below in more detail. The invention polypeptide, biologically active fragments, and functional equivalents thereof can also be produced by chemical synthesis. As used herein, "biologically active fragment" refers to any portion of the polypeptide represented by the amino acid sequence in FIGS. 4, 8, 11 and 15 that can assemble into an active protein. Synthetic polypeptides can be produced using an Applied Biosystems, Inc. Model 430A or 431A automatic peptide synthesizer (Foster City, Calif.) employing the chemistry provided by the manufacturer.

Modification of the invention nucleic acids, polynucleotides, polypeptides, peptides or proteins with the following phrases: "recombinantly expressed/produced", "isolated", or "substantially pure", encompasses nucleic acids, polynucleotides, polypeptides, peptides or proteins that have been produced in such form by the hand of man, and are thus separated from their native in vivo cellular environment. As a result of this human intervention, the recombinant nucleic acids, polynucleotides, polypeptides, peptides and proteins of the invention are useful in ways that the corresponding naturally occurring molecules are not, such as identification of selective drugs or compounds.

Sequences having "substantial sequence homology" are intended to refer to nucleotide sequences that share at least about 90% identity with invention nucleic acids, and amino acid sequences that typically share at least about 95% amino acid identity with invention polypeptides. It is recognized, however, that polypeptides or nucleic acids that contain less than the above-described levels of homology, whether arising as splice variants or modified by conservative amino acid substitutions or by substitution of degenerate codons, are also encompassed within the scope of the present invention.
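The identity thresholds above can be applied to a pair of already-aligned sequences as in the following minimal sketch. The scoring convention (identical, non-gap columns divided by alignment length), the function names, and the example sequences are assumptions made for illustration; the disclosure does not specify how identity is to be computed.

```python
# Thresholds taken from the definition above ("at least about" 90% / 95%).
THRESHOLDS = {"nucleotide": 90.0, "protein": 95.0}


def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns that are identical; gap columns ('-') never match."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("inputs must be non-empty and of equal aligned length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def has_substantial_homology(aligned_a, aligned_b, kind):
    """Compare percent identity against the threshold for 'nucleotide' or 'protein'."""
    return percent_identity(aligned_a, aligned_b) >= THRESHOLDS[kind]


# Hypothetical aligned 10-mers differing at one position.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
print(has_substantial_homology("ACGTACGTAC", "ACGTACGTAA", "nucleotide"))
```

The same arithmetic underlies figures of the form "identical positions / aligned positions" reported elsewhere in this description.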

The present invention provides a nucleic acid probe comprising a polynucleotide capable of specifically hybridizing with a sequence included within the nucleic acid sequence encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide, for example, a coding sequence included within the nucleotide sequence shown in FIGS. 3, 4, 8, 11 and 15, respectively.

As used herein, a "nucleic acid probe" may be a sequence of nucleotides that includes from about 12 to about 60 contiguous bases set forth in FIGS. 3, 4, 8, 11 and 15, preferably about 18 nucleotides, may be single- or double-stranded, and may be labeled or modified as described herein. Preferred regions from which to construct probes include 5' and/or 3' coding sequences, sequences predicted to encode transmembrane domains, sequences predicted to encode cytoplasmic loops, signal sequences, ligand binding sites, and the like.
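As a rough illustration of the probe lengths recited above, the following sketch enumerates every contiguous window of a chosen length (18 nucleotides by default, within the 12 to 60 range) from a sequence and reports its G+C fraction. The function name and the input sequence are hypothetical; selection of an actual probe would also take into account the preferred regions listed above.

```python
def candidate_probes(sequence, length=18):
    """Return (start, probe, gc_fraction) for each contiguous window of the given length."""
    if not 12 <= length <= 60:
        raise ValueError("probe length is expected to be from about 12 to about 60")
    seq = sequence.upper()
    probes = []
    for start in range(len(seq) - length + 1):
        window = seq[start:start + length]
        gc_fraction = (window.count("G") + window.count("C")) / length
        probes.append((start, window, gc_fraction))
    return probes


# Hypothetical 20-nucleotide input yields three overlapping 18-mers.
for start, probe, gc in candidate_probes("ACGT" * 5):
    print(start, probe, gc)
```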

Full-length cDNA clones or fragments thereof can also be used as probes for the detection and isolation of related genes. When fragments are used as probes, preferably the cDNA sequences will be from the carboxyl end-encoding portion of the cDNA, and most preferably will include predicted transmembrane domain-encoding portions of the cDNA sequence. Transmembrane domain regions can be predicted based on hydropathy analysis of the deduced amino acid sequence using, for example, the method of Kyte and Doolittle (J. Mol. Biol. 157:105, 1982).
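A hydropathy analysis of the kind referred to above can be sketched as a sliding-window average over the Kyte and Doolittle hydropathy scale. The window of 19 residues and the cutoff of 1.6 are conventional choices for locating candidate membrane-spanning segments and are assumptions here, as are the function names and the example peptide.

```python
# Kyte-Doolittle hydropathy values for the twenty standard amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Mean hydropathy of each contiguous window along the protein sequence."""
    values = [KD_SCALE[residue] for residue in protein.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def candidate_transmembrane_windows(protein, window=19, cutoff=1.6):
    """Start positions (0-based) of windows whose mean hydropathy meets the cutoff."""
    profile = hydropathy_profile(protein, window)
    return [i for i, score in enumerate(profile) if score >= cutoff]


# Hypothetical peptide: a charged stretch followed by a hydrophobic stretch.
peptide = "KDEKRDEKRD" + "LIVALLIVFAGLLIVALLV" + "KRDE"
print(candidate_transmembrane_windows(peptide))
```

Windows flagged in this way would only be candidates; the disclosure leaves the choice of parameters and any confirmation of transmembrane topology to the skilled artisan.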

As used herein, the phrase "specifically hybridizing" encompasses the ability of a polynucleotide to recognize a sequence of nucleic acids that are complementary thereto and to form double-helical segments via hydrogen bonding between complementary base pairs. Nucleic acid probe technology is well known to those skilled in the art, who will readily appreciate that such probes may vary greatly in length and may be labeled with a detectable agent, such as a radioisotope, a fluorescent dye, and the like, to facilitate detection of the probe. Invention probes are useful to detect the presence of nucleic acids encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides. For example, the probes can be used for in situ hybridizations in order to locate biological tissues in which the invention gene is expressed. Additionally, synthesized oligonucleotides complementary to the nucleic acids of a polynucleotide encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides are useful as probes for detecting the invention genes, their associated mRNA, or for the isolation of related genes using homology screening of genomic or cDNA libraries, or by using amplification techniques well known to one of skill in the art.

Also provided are antisense oligonucleotides having a sequence capable of binding specifically with any portion of an mRNA that encodes human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide so as to prevent translation of the mRNA. The antisense oligonucleotide may have a sequence capable of binding specifically with any portion of the sequence of the cDNA encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide. As used herein, the phrase "binding specifically" encompasses the ability of a nucleic acid sequence to recognize a complementary nucleic acid sequence and to form double-helical segments therewith via the formation of hydrogen bonds between the complementary base pairs. An example of an antisense oligonucleotide is an antisense oligonucleotide comprising chemical analogs of nucleotides (i.e., synthetic antisense oligonucleotide, SAO).
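As a minimal sketch of the complementarity described above, the base sequence of an oligonucleotide able to pair with a chosen sense-strand segment is the reverse complement of that segment. The function name and the six-base example in the usage note are hypothetical and are not taken from FIGS. 3, 4, 8, 11 or 15; chemical modification of the backbone or bases is outside the scope of the sketch.

```python
# Watson-Crick partners; U in an mRNA input pairs with A in the DNA oligo.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense_dna(sense_segment):
    """Return the 5'->3' DNA sequence complementary to a sense mRNA/cDNA segment."""
    seq = sense_segment.strip()
    unexpected = set(seq.upper()) - set("ACGTU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement each base, then reverse so the result reads 5'->3'.
    return seq.translate(_COMPLEMENT)[::-1].upper()
```

For instance, antisense_dna("AUGGCC") returns "GGCCAT", which pairs base for base with the input when the two strands are aligned antiparallel.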

Also provided herein are compositions (SAOC) comprising an acceptable hydrophobic carrier capable of passing through a cell membrane and an amount of the antisense oligonucleotide effective to reduce expression of the human netrin, the human ABC3 transporter, the human ribosomal L3 subtype, or the human augmenter of liver regeneration polypeptide by passing through a cell membrane and binding specifically with mRNA encoding the polypeptide so as to prevent its translation. The acceptable hydrophobic carrier capable of passing through cell membranes may also comprise a structure which binds to a receptor specific for a selected cell type and is thereby taken up by cells of the selected cell type. The structure may be part of a protein known to bind to a cell-type specific receptor.

This invention provides a means to modulate levels of expression of invention polypeptides by the use of a synthetic antisense oligonucleotide composition (SAOC) which inhibits translation of mRNA encoding these polypeptides. Synthetic oligonucleotides, or other antisense chemical structures designed to recognize and selectively bind to mRNA, are constructed of DNA, RNA or chemically modified, artificial nucleic acids to be complementary to portions of the nucleotide sequences shown in FIGS. 3, 4, 8, 11 and 15. The SAOC is designed to be stable in the blood stream for administration to a subject by injection, or in laboratory cell culture conditions. The SAOC is designed to be capable of passing through the cell membrane in order to enter the cytoplasm of the cell by virtue of physical and chemical properties of the SAOC which render it capable of passing through cell membranes, for example, by designing small, hydrophobic SAOC chemical structures, or by virtue of specific transport systems in the cell which recognize and transport the SAOC into the cell.

In addition, the SAOC can be designed for administration only to certain selected cell populations by targeting the SAOC to be recognized by specific cellular uptake mechanisms which bind and take up the SAOC only within select cell populations. For example, the SAOC may be designed to bind to a receptor found only in a certain cell type, as discussed supra. The SAOC is also designed to recognize and selectively bind to the target mRNA sequence, which may correspond to a sequence contained within the sequence shown in FIGS. 3, 4, 8, 11 and 15. The SAOC is designed to inactivate the target mRNA sequence by either binding to the target mRNA and inducing degradation of the mRNA by, for example, RNase I digestion, or inhibiting translation of the mRNA target by interfering with the binding of translation-regulating factors or ribosomes, or inclusion of other chemical structures, such as ribozyme sequences or reactive chemical groups which either degrade or chemically modify the target mRNA. SAOCs have been shown to be capable of such properties when directed against mRNA targets (see Cohen et al., TIPS, 10:435, 1989 and Weintraub, Sci. American, January, p. 40, 1990).

This invention further provides a composition containing an acceptable carrier and any of an isolated, purified human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide, an active fragment thereof, or a purified, mature protein and active fragments thereof, alone or in combination with each other. These polypeptides or proteins can be recombinantly derived, chemically synthesized or purified from native sources. As used herein, the term "acceptable carrier" encompasses any of the standard pharmaceutical carriers, such as phosphate buffered saline solution, water and emulsions such as an oil/water or water/oil emulsion, and various types of wetting agents.

Also provided are antibodies having specific reactivity with the human netrin, the human ABC3 transporter, the human ribosomal L3 subtype, or the human augmenter of liver regeneration polypeptides of the subject invention. Active fragments of antibodies are encompassed within the definition of "antibody". Invention antibodies can be produced by methods known in the art using the invention proteins or portions thereof as antigens. For example, polyclonal and monoclonal antibodies can be produced by methods well known in the art, as described, for example, in Harlow and Lane, Antibodies: A Laboratory Manual (Cold Spring Harbor Laboratory 1988).

The polypeptides of the present invention can be used as the immunogen in generating such antibodies. Alternatively, synthetic peptides can be prepared (using commercially available synthesizers) and used as immunogens. Where natural or synthetic hNET-, hABC3-, RPL3L- (SEM L3-), and/or hALR-derived peptides are used to induce a hNET-, hABC3-, RPL3L- (SEM L3-), and/or hALR-specific immune response, the peptides may be conveniently coupled to a suitable carrier such as KLH and administered in a suitable adjuvant such as Freund's. Preferably, selected peptides are coupled to a lysine core carrier substantially according to the methods of Tam, Proc. Natl. Acad. Sci., USA 85:5409-5413, 1988. The resulting antibodies may be modified to a monovalent form, such as, for example, Fab, Fab₂, Fab', or Fv. Anti-idiotypic antibodies may also be prepared using known methods.

In one embodiment, normal or mutated hNET, hABC3, RPL3L (SEM L3), or hALR polypeptides are used to immunize mice, after which their spleens are removed, and splenocytes are used to form cell hybrids with myeloma cells and obtain clones of antibody-secreting cells according to techniques that are standard in the art. The resulting monoclonal antibodies are screened for specific binding to hNET, hABC3, RPL3L (SEM L3), and/or hALR proteins or hNET-, hABC3-, RPL3L- (SEM L3-), and/or hALR-related peptides.

In another embodiment, antibodies are screened for selective binding to normal or mutated hNET, hABC3, RPL3L (SEM L3), or hALR sequences. Antibodies that distinguish between normal and mutant forms of hNET, hABC3, RPL3L (SEM L3), or hALR may be used in diagnostic tests (see below) employing ELISA, EMIT, CEDIA, SLIFA, and the like. Anti-hNET, hABC3, RPL3L (SEM L3), or hALR antibodies may also be used to perform subcellular and histochemical localization studies. Finally, antibodies may be used to block the function of the hNET, hABC3, RPL3L (SEM L3), and/or hALR polypeptide, whether normal or mutant, or to perform rational drug design studies to identify and test inhibitors of the function (e.g., using an anti-idiotypic antibody approach).

Amino acid sequences can be analyzed by methods well known in the art to determine whether they encode hydrophobic or hydrophilic domains of the corresponding polypeptide. Altered antibodies such as chimeric, humanized, CDR-grafted or bifunctional antibodies can also be produced by methods well known in the art. Such antibodies can also be produced by hybridoma, chemical synthesis or recombinant methods described, for example, in Sambrook et al., supra., and Harlow and Lane, supra. Both anti-peptide and anti-fusion protein antibodies can be used (see, for example, Bahouth et al., Trends Pharmacol. Sci. 12:338, 1991; Ausubel et al., supra.).

Invention antibodies can be used to isolate invention polypeptides. Additionally, the antibodies are useful for detecting the presence of the invention polypeptides, as well as analysis of polypeptide localization, composition, and structure of functional domains. Methods for detecting the presence of a human netrin, a human ABC3 transporter, a human ribosomal L3 subtype, or a human augmenter of liver regeneration polypeptide on a cell comprise contacting the cell with an antibody that specifically binds to the polypeptide, under conditions permitting binding of the antibody to the polypeptide, detecting the presence of the antibody bound to the cell, and thereby detecting the presence of the invention polypeptide on the cell. With respect to the detection of such polypeptides, the antibodies can be used for in vitro diagnostic or in vivo imaging methods.

Immunological procedures useful for in vitro detection of the target human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide in a sample include immunoassays that employ a detectable antibody. Such immunoassays include, for example, ELISA, Pandex microfluorimetric assay, agglutination assays, flow cytometry, serum diagnostic assays and immunohistochemical staining procedures which are well known in the art. An antibody can be made detectable by various means well known in the art. For example, a detectable marker can be directly or indirectly attached to the antibody. Useful markers include, for example, radionuclides, enzymes, fluorogens, chromogens and chemiluminescent labels.

For in vivo imaging methods, a detectable antibody can be administered to a subject and the binding of the antibody to the invention polypeptide can be detected by imaging techniques well known in the art. Suitable imaging agents are known and include, for example, gamma-emitting radionuclides such as ¹¹¹In, ⁹⁹ᵐTc, ⁵¹Cr and the like, as well as paramagnetic metal ions, which are described in U.S. Pat. No. 4,647,447. The radionuclides permit the imaging of tissues by gamma scintillation photometry, positron emission tomography, single photon emission computed tomography and gamma camera whole body imaging, while paramagnetic metal ions permit visualization by magnetic resonance imaging.

The invention provides a transgenic non-human mammal that is capable of expressing nucleic acids encoding a human netrin, a human ABC3 transporter, a human ribosomal L3 subtype, or a human augmenter of liver regeneration polypeptide. Also provided is a transgenic non-human mammal capable of expressing nucleic acids encoding a human netrin, a human ABC3 transporter, a human ribosomal L3 subtype, or a human augmenter of liver regeneration polypeptide so mutated as to be incapable of normal activity, i.e., a mammal that does not express native protein.

The present invention also provides a transgenic non-human mammal having a genome comprising antisense nucleic acids complementary to nucleic acids encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide so placed as to be transcribed into antisense mRNA complementary to mRNA encoding a human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide, which hybridizes thereto and, thereby, reduces the translation thereof. The polynucleotide may additionally comprise an inducible promoter and/or tissue specific regulatory elements, so that expression can be induced, or restricted to specific cell types. Examples of polynucleotides are DNA or cDNA having a coding sequence substantially the same as the coding sequence shown in FIGS. 3, 4, 8, 11 and 15. Examples of non-human transgenic mammals are transgenic cows, sheep, goats, pigs, rabbits, rats and mice. Examples of tissue specificity-determining elements are the metallothionein promoter and the T7 promoter.

Animal model systems which elucidate the physiological and behavioral roles of invention polypeptides are produced by creating transgenic animals in which the expression of the polypeptide is altered using a variety of techniques. Examples of such techniques include the insertion of normal or mutant versions of nucleic acids encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide by microinjection, retroviral infection or other means well known to those skilled in the art, into appropriate fertilized embryos to produce a transgenic animal. See, for example, Carver et al., Bio/Technology 11:1263-1270, 1993; Carver et al., Cytotechnology 9:77-84, 1992; Clark et al., Bio/Technology 7:487-492, 1989; Simons et al., Bio/Technology 6:179-183, 1988; Swanson et al., Bio/Technology 10:557-559, 1992; Velander et al., Proc. Natl. Acad. Sci., USA 89:12003-12007, 1992; Hammer et al., Nature 315:680-683, 1985; Krimpenfort et al., Bio/Technology 9:844-847, 1991; Ebert et al., Bio/Technology 9:835-838, 1991; Simons et al., Nature 328:530-532, 1987; Pittius et al., Proc. Natl. Acad. Sci., USA 85:5874-5878, 1988; Greenberg et al., Proc. Natl. Acad. Sci., USA 88:8327-8331, 1991; Whitelaw et al., Transg. Res. 1:3-13, 1991; Gordon et al., Bio/Technology 5:1183-1187, 1987; Grosveld et al., Cell 51:975-985, 1987; Brinster et al., Proc. Natl. Acad. Sci., USA 88:478-482, 1991; Brinster et al., Proc. Natl. Acad. Sci., USA 85:836-840, 1988; Brinster et al., Proc. Natl. Acad. Sci., USA 82:4438-4442, 1985; Al-Shawi et al., Mol. Cell. Biol. 10(3):1192-1198, 1990; Van Der Putten et al., Proc. Natl. Acad. Sci., USA 82:6148-6152, 1985; Thompson et al., Cell 56:313-321, 1989; Gordon et al., Science 214:1244-1246, 1981; and Hogan et al., Manipulating the Mouse Embryo: A Laboratory Manual (Cold Spring Harbor Laboratory, 1986).

Another technique, homologous recombination of mutant or normal versions of these genes with the native gene locus in transgenic animals, may be used to alter the regulation of expression or the structure of the invention polypeptides (see, Capecchi et al., Science 244:1288, 1989; Zimmer et al., Nature 338:150, 1989). Homologous recombination techniques are well known in the art. Homologous recombination replaces the native (endogenous) gene with a recombinant or mutated gene to produce an animal that cannot express native (endogenous) protein but can express, for example, a mutated protein which results in altered expression of the human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide.

In contrast to homologous recombination, microinjection adds genes to the host genome, without removing host genes. Microinjection can produce a transgenic animal that is capable of expressing both endogenous and exogenous human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides. Inducible promoters can be linked to the coding region of the nucleic acids to provide a means to regulate expression of the transgene. Tissue-specific regulatory elements can be linked to the coding region to permit tissue-specific expression of the transgene. Transgenic animal model systems are useful for in vivo screening of compounds for identification of ligands, i.e., agonists and antagonists, which activate or inhibit polypeptide responses.

The nucleic acids, oligonucleotides (including antisense), vectors containing same, transformed host cells, polypeptides, as well as antibodies of the present invention, can be used to screen compounds in vitro to determine whether a compound functions as a potential agonist or antagonist to the invention protein. These in vitro screening assays provide information regarding the function and activity of the invention protein, which can lead to the identification and design of compounds that are capable of specific interaction with invention proteins.

In accordance with still another embodiment of the present invention, there is provided a method for identifying compounds which bind to human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides. The invention proteins may be employed in a competitive binding assay. Such an assay can accommodate the rapid screening of a large number of compounds to determine which compounds, if any, are capable of binding to invention polypeptides. Subsequently, more detailed assays can be carried out with those compounds found to bind, to further determine whether such compounds act as modulators, agonists or antagonists of invention polypeptides.

In accordance with another embodiment of the present invention, transformed host cells that recombinantly express invention polypeptides can be contacted with a test compound, and the modulating effect(s) thereof can then be evaluated by comparing the human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide-mediated response in the presence and absence of test compound, or by comparing the response of test cells and control cells (i.e., cells that do not express invention polypeptides) to the presence of the compound.

As used herein, a compound or a signal that "modulates the activity" of an invention polypeptide refers to a compound or a signal that alters the activity of the human netrin, the human ABC3 transporter, the human ribosomal L3 subtype, or the human augmenter of liver regeneration polypeptide so that the activity of the invention polypeptide is different in the presence of the compound or signal than in the absence of the compound or signal. In particular, such compounds or signals include agonists and antagonists. An agonist encompasses a compound or a signal that activates polypeptide function. Alternatively, an antagonist includes a compound or signal that interferes with polypeptide function. Typically, the effect of an antagonist is observed as a blocking of agonist-induced protein activation. Antagonists include competitive and non-competitive antagonists. A competitive antagonist (or competitive blocker) interacts with or near the site specific for agonist binding. A non-competitive antagonist or blocker inactivates the function of the polypeptide by interacting with a site other than the agonist interaction site.

The following examples are intended to illustrate the invention without limiting the scope thereof.

EXAMPLE I Contig Assembly

A. Cosmids

Multiple cosmids were used as reagents to initiate walks in YAC and P1 libraries. Clones 16-166N (D16S277), 16-191N (D16S279), 16-198N (D16S280) and 16-140N (D16S276) were previously isolated from a cosmid library (Lerner et al., Mamm. Genome 3:92-100, 1992). Cosmids cCMM65 (D16S84), c291 (D16S291), cAJ42 (ATP6C) and cKG8 were recovered from total human cosmid libraries (made in-house or by Stratagene, La Jolla, Calif.) using either a cloned insert (CMM65) or sequence-specific oligonucleotides as probe. The c326 cosmid contig and clone 413C12 originated from a flow-sorted chromosome 16 library (Stallings et al., Genomics 13(4):1031-1039, 1992). The c326 contig comprised clones 2H2, 77E8, 325A11 and 325B10.

B. YACs

Screening of gridded interspersed-repetitive sequence (IRS) pools from Mark I, Mark II and Mega-YAC libraries with cosmid-specific IRS probes was as previously described (Liu et al., Genomics 26:178-191, 1995). IRS probes were made from cosmids 16-166N, 16-191N, cAJ42, 16-198N, 325A11, cCMM65, and 16-140N. Biotinylated YAC probes were generated by nick-translating complex mixtures of IRS products from each YAC. Mixtures of sufficient complexity were achieved by performing independent DNA amplifications of total yeast DNA using various Alu primers (Lichter et al., Proc. Natl. Acad. Sci., USA 87:6634-6638, 1990) and then combining the appropriate reactions containing the most diverse products.

C. P1s

Chromosome walking experiments were done using a single set of membranes which contained the gridded P1 library pools (Shepherd et al., supra. 1994). The gridded filters were kindly provided by Dr. Mark Leppert and the Technology Access Section of the Utah Center for Human Genome Research at the University of Utah. P1 gridded membranes were screened using end probes derived from a set of chromosome 16 cosmids (see above) and P1 clones as they were identified. Both RNA transcripts and bubble-PCR products were utilized as end probes.

D. Probes

Radiolabeled transcripts were generated using restriction enzyme digested cosmids or P1s (AluI, HaeIII, RsaI, TaqI) as template for phage RNA polymerases T3, T7 and SP6. The T3 and T7 promoter elements were present on the cosmid-derived templates while T7 and SP6 promoter sequences were contained on the P1-based templates. Transcription reactions were performed as recommended by the manufacturer (Stratagene, La Jolla, Calif.) in the presence of [α-³²P]-ATP (Amersham, Arlington Heights, Ill.).

Bubble-PCR products were synthesized from restriction enzyme digested P1s (AluI, HaeIII, RsaI, TaqI). Bubble adaptors with appropriate overhangs and phosphorylated 5' ends were ligated to digested P1 DNA basically as described for YACs (Riley et al., Nuc. Acids Res. 18:2887-2890, 1990). The sequence of the universal vectorette primer derived from the bubble adaptor sequence was 5'-GTTCGTACGAGAATCGCT-3' (SEQ ID NO:67), and differed from that of Riley and co-workers in having 12 fewer 5' nucleotides. The Tm of the truncated vectorette primer more closely matched that of the paired amplimer from the vector-derived promoter sequence (SP6, T7). The desired bubble-PCR product was gel purified prior to radiolabeling (Feinberg et al., Anal. Biochem. 132:6-13, 1983; Feinberg and Vogelstein, Anal. Biochem. 137:266-267, 1984).
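For orientation only, the melting temperature of a short primer such as the truncated vectorette primer can be roughly estimated with the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees C. This rule of thumb is an assumption introduced here for illustration; the specification does not state how primer Tm values were compared, and the function name is hypothetical.

```python
def wallace_tm(primer):
    """Rough Tm estimate (degrees C) for a short DNA oligonucleotide.

    Wallace rule: 2 degrees per A/T and 4 degrees per G/C. It ignores salt,
    primer concentration and nearest-neighbor effects, so treat the result
    as a coarse guide only.
    """
    seq = primer.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# The 18-mer vectorette primer (SEQ ID NO:67) has 9 A/T and 9 G/C bases.
VECTORETTE_PRIMER = "GTTCGTACGAGAATCGCT"
print(wallace_tm(VECTORETTE_PRIMER))  # 54
```

Applying the same estimate to the paired promoter-derived amplimer would show how closely the two primers are matched under this approximation.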

The specificity of all end probes was determined prior to their use on the single set of gridded P1 filter arrays. Radiolabeled probes were pre-annealed to Cot1 DNA as recommended (Life Technologies Inc., Gaithersburg, Md.) and then hybridized to strips of nylon membrane to which were bound 10-20 ng each of the following DNAs: the cloned genomic template used to create the probe; one or more unrelated cloned genomic DNAs; cloned vector (no insert); and human genomic DNA.

Hybridizations were performed in CAK solution (5× SSPE, 1% SDS, 5× Denhardt's Solution, 100 mg/mL torula RNA) at 65° C. overnight. Individual end probes were present at a concentration of 5×10⁵ cpm/mL. Hybridized membranes were washed to a final stringency of 0.1× SSC/0.1% SDS at 65° C. The hybridization results were visualized by autoradiography. Probes which hybridized robustly to their respective cloned template while not hybridizing to unrelated cloned DNAs, vector DNA or genomic DNA were identified and used to screen the gridded P1 filters.

Hybridization to the arrayed P1 pools was performed as described for the nylon membrane strips (above) except that multiple probes were used simultaneously. Positive clones were identified, plated at a density of 200-500 cfu per 100 mm plate (LB plus 25 mg/mL kanamycin), lifted onto 82 mm HATF membranes (Millipore, Bedford, Mass.), processed for hybridization (Sambrook et al., supra.) and then rescreened with the complex probe mixture.

A single positive clone from each pool was selected and replated onto a master plate. To identify the colony purified genomic P1 clone and its corresponding probe, multiple P1 DNA dot blots were prepared and each hybridized to individual radiolabeled probes. All hybridizations contained a chromosome 16p13.3 reference probe, e.g. cAJ42, as well as a uniquely labeled P1 DNA probe.

EXAMPLE II Exon Trapping

Genomic P1 clones were prepared for exon trapping experiments by digestion with PstI, double digestion with BamHI/BglII, or by partial digestion with limiting amounts of Sau3AI. The BamHI/BglII- and Sau3AI-digested P1 DNAs were ligated to BamHI-cut and dephosphorylated vector, pSPL3B, while PstI-digested P1 DNA was subcloned into PstI-cut dephosphorylated vector, pSPL3B.

Ligations were performed in triplicate using 50 ng of vector DNA and 1, 3 or 6 mass equivalents of digested P1 DNA. Transformations were performed following an overnight 16° C. incubation, with 1/10 and 1/2 of the transformation being plated on LB (ampicillin) plates. After overnight growth at 37° C., colonies were scraped off those plates having the highest transformation efficiency (based on a comparison to "no insert" ligation controls) and miniprepped using the alkaline lysis method. To examine the proportion of the pSPL3B containing insert, a small portion of the miniprep was digested with HindIII, which cuts pSPL3B on each side of the multiple cloning site.

EXAMPLE III RNA Preparation

Approximately 10 μg of the remaining miniprep DNA was ethanol precipitated, resuspended in 100 μl of sterile PBS and electroporated into approximately 2×10⁶ COS-7 cells (in 0.7 ml of ice cold PBS) using a BioRad GenePulser electroporator (1.2 kV, 25 μF and 200 Ω). The electroporated cells were incubated for 10 min. on ice prior to their addition to a 100 mm tissue culture dish containing 10 ml of prewarmed complete DMEM.

Cytoplasmic RNA was isolated 48 hours post-transfection. The transfected COS-7 cells were removed from tissue culture dishes using 0.25% trypsin/1 mM EDTA (Life Technologies Inc., Gaithersburg, Md.). Trypsinized cells were washed in DMEM/10% FCS and resuspended in 400 μl of ice cold TKM (10 mM Tris-HCl pH 7.5, 10 mM KCl, 1 mM MgCl₂) supplemented with 1 μl of RNasin (Promega, Madison, Wis.). After adding 20 μl of 10% Triton X-100, the cells were incubated for 5 min. on ice. The nuclei were removed by centrifugation at 1200 rpm for 5 min. at 4° C. Thirty microliters of 5% SDS was added to the supernatant, with the cytoplasmic RNA being further purified by three rounds of extraction using phenol/chloroform/isoamyl alcohol (24:24:1). The cytoplasmic RNA was ethanol precipitated and resuspended in 50 μl of H₂O.

Reverse transcription and PCR were performed on the cytoplasmic RNA prepared above as described (Church et al., supra. 1994) using commercially available exon trapping oligonucleotides (Life Technologies Inc., Gaithersburg, Md.). The resulting CUA-tailed products were shotgun subcloned into pAMP10 as recommended by the manufacturer (Life Technologies Inc.). Random clones from each ligation were analyzed by colony PCR using secondary PCR primers (Life Technologies Inc.).

Miniprep DNA containing the pAMP10/exon traps was prepared from overnight cultures by alkaline lysis using the EasyPrep manifold or a QIAwell 8 system according to the manufacturers' instructions (Pharmacia, Piscataway, N.J. and Qiagen Inc., Chatsworth, Calif., respectively). DNA products containing trapped exons, based on comparison to the 177 bp "vector only" DNA product, were selected for sequencing.

EXAMPLE IV Sequencing

DNA sequencing was performed using Pharmacia ALF and Applied Biosystems 377 PRISM automated DNA sequencers (Piscataway, N.J., and Foster City, Calif.). DNA sequences were aligned using Sequencher DNA analysis software (Genecodes, Ann Arbor, Mich.). DNA and protein database searches were performed using the BLASTN (Altschul et al., J. Mol. Biol. 215:403-410, 1990) and BLASTX (Altschul et al., supra. 1990; Gish et al., Nat. Genet. 3:266-272, 1993) programs. SASE sequences were analyzed by processing BLAST (Altschul et al., supra. 1990; Gish et al., supra. 1993) and FASTA (Lipman et al., Science 227:1435-1441, 1985) searches. Protein sequences were analyzed using MacVector (Oxford Molecular Group, Campbell, Calif.), BCM Launcher (Smith et al., Genome Research 6:454-462, 1996), ClustalW (Thompson et al., Nucleic Acids Res. 22:4673-4680, 1994), and PSORT (Nakai et al., Genomics 14:897-911, 1992).

EXAMPLE V RT-PCR, RACE, SASE and cDNA Isolation

Based upon the sequence determined (above), two oligonucleotide primers (Table II) were designed for each exon trap using Oligo 4.0 (National Biosciences Inc., Plymouth, Minn.).

To determine which tissue-specific library to screen for transcript or cDNA, RT-PCR reactions and/or PCR reactions were performed using different tissue-derived RNAs and/or cDNA libraries, respectively, as template with the oligonucleotide primers designed for each exon trap (above).

The oligonucleotides designed from the exons (Table II) were then used in one or more of the following positive selection formats to screen the corresponding tissue-specific cDNA library.

For RT-PCR experiments, the first oligonucleotide was used as a sense primer and the second oligonucleotide was used as an antisense primer. RT-PCR was performed as described using polyA⁺ RNA from adult brain and placenta (Kawasaki, In PCR Protocols: A Guide to Methods and Applications, Eds. Innis et al., Academic Press, San Diego, Calif., pp. 21-27, 1990). All PCR products were cloned using the pGEM-T vector as described by the manufacturer (Promega, Madison, Wis.).

To clone sequences 3' to selected exon traps, rapid amplification of cDNA ends (RACE) was performed as described (Frohman, PCR Met. Appl. 4:S40-S58, 1994). In 3' RACE experiments, the first oligonucleotide was used as the external primer and the second oligonucleotide was used as the internal primer.

For the Genetrapper cDNA Positive Selection System, the first oligonucleotide primer was biotinylated and used for direct selection, while the second oligonucleotide was used in the repair.

In addition to exon trapping, the cloned contig was also screened using cDNA selection essentially as described (Parimoo et al., Anal. Biochem. 228:1-17, 1995), using the genomic P1 clones from this interval (Dackowski et al., Genome Res. 6:515-524, 1996). Other coding sequence was obtained by SAmple SEquencing (SASE).

SASE was performed as a functional genomics method for gene identification. Briefly, DNA from individual P1s was partially digested with Sau3A and 3 kb fragments were subcloned into the pBluescript KS⁺ plasmid (Stratagene, La Jolla, Calif.). Subclones were sequenced from both ends to generate sequences semi-randomly from the P1 clone.
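To illustrate why a partial Sau3A digest yields semi-random subclones, the following sketch locates Sau3A recognition sites (GATC, cleaved immediately 5' of the G) in a sequence and lists the fragment lengths a complete digest of a linear molecule would give; a partial digest produces fragments spanning one or more of these adjacent intervals. The function names and the short test sequence in the usage note are hypothetical and are not derived from any P1 clone.

```python
def sau3a_sites(seq):
    """Return 0-based positions of every GATC site (cut falls just before the G)."""
    seq = seq.upper()
    sites = []
    i = seq.find("GATC")
    while i != -1:
        sites.append(i)
        i = seq.find("GATC", i + 1)
    return sites


def complete_digest_fragments(seq):
    """Fragment lengths after cutting a linear molecule at every GATC site."""
    # A site at position 0 does not create an extra fragment.
    cuts = [0] + [s for s in sau3a_sites(seq) if s > 0] + [len(seq)]
    return [cuts[k + 1] - cuts[k] for k in range(len(cuts) - 1)]
```

For the 16-base sequence "AAGATCTTTTGATCCC", sau3a_sites returns [2, 10] and complete_digest_fragments returns [2, 8, 6].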

EXAMPLE VI Nucleotide Sequence Analysis

hNET

A random shotgun library was prepared from the 53.8B P1 clone (FIG. 18) by subcloning randomly sheared P1 DNA into the pAMP10 vector (Life Technologies Inc., Gaithersburg, Md.) essentially as described (Andersson et al., (1994) Anal. Biochem. 218:300-308). P1 DNA was randomly sheared using a nebulizer (Hudson RCI, Temecula, Calif.). The library was initially screened with a 6 kb XhoI fragment, which had been shown to contain the netrin encoding exon traps (FIG. 18). The library was subsequently screened with an adjacent 3.5 kb XhoI fragment in order to obtain additional clones for sequencing. Positive clones were sequenced using forward and reverse vector primers as previously described (The American PKD1 Consortium (1995) Hum. Mol. Genet. 4:575-582).

The genomic sequence was edited and assembled using Sequencher (GeneCodes, Ann Arbor, Mich.). The coding region was predicted using the World Wide Web version of the GRAIL2 program (Uberbacher and Mural (1991) Proc. Natl. Acad. Sci., USA 88:11261-11265; Xu et al. (1994) Genet. Eng. N.Y. 16:241-253) and a MacVector (Oxford Molecular Group, Campbell, Calif.) Pustell DNA/protein matrix analysis comparing the genomic sequence (translated in all reading frames) to the chicken netrins. Database searches were performed using BLASTN (Altschul et al. (1990) J. Mol. Biol. 215:403-410) and BLASTX (Altschul et al., 1990, supra; Gish and States (1993) Nat. Genet. 3:266-272).

RT-PCR: Both adult (brain, heart, kidney, leukocytes, liver, lung, a lymphoblastoid cell line, placenta, spleen, and testis) and fetal (kidney and brain) cDNA libraries were prescreened for the presence of netrin cDNAs by PCR as described (Van Raay et al., 1996, supra). Nested RT-PCR was utilized to clone transcribed sequences from the netrin gene. Briefly, spinal cord polyA+ RNA (Clontech, Palo Alto, Calif.) was reverse transcribed using random primers as described (Kawasaki, 1990 In "PCR Protocols: A Guide to Methods and Applications" (M. A. Innis, D. H. Gelfand, J. J. Sninsky, and T. J. White. Eds.), pp. 21-27, Academic Press, Inc., San Diego).

Primers for PCR (Table IV) were designed based on the exons predicted from the analysis of the genomic sequence and used to amplify spinal cord RNA since spinal cord has been previously shown to express low levels of chicken netrin (Serafini et al., supra). Nested PCR was required to detect RT-PCR products from human spinal cord RNA. Spinal cord RNA was reverse transcribed with random primers and primary PCR was performed in the presence of 2.5 M betaine (Sigma Chemical Co., St. Louis, Mo.) using the primers designed from the gene model (Table IV). The primary PCR reactions were then diluted 1:20 and secondary PCR was performed on 1 μL of the diluted primary reactions using nested primers (also designed from the gene model), again in the presence of betaine. The inclusion of betaine at a final concentration of 2.5 M in the PCR reactions dramatically increased the purity and yield of the human netrin RT-PCR products (see, for example, International Publication No. WO 96/12041; Reeves et al. (1994) Am. J. Hum. Genet. 55:A238; Baskaran et al. (1996) Genome Research 6:633-638).
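
The betaine concentration and the 1:20 dilution described above reduce to simple C1·V1 = C2·V2 arithmetic. The following is a minimal sketch of that calculation; the 5 M stock concentration and the 50 μL reaction volume are illustrative assumptions only and are not stated in the protocol.

```python
def stock_volume_ul(final_molar, stock_molar, reaction_ul):
    """Volume of stock (in microliters) giving the target final concentration.

    Uses C1 * V1 = C2 * V2. The stock must be more concentrated than the target.
    """
    if stock_molar <= final_molar:
        raise ValueError("stock must be more concentrated than the final target")
    return final_molar * reaction_ul / stock_molar


# Assumed values (hypothetical): 5 M betaine stock, 50 uL reaction.
betaine_ul = stock_volume_ul(final_molar=2.5, stock_molar=5.0, reaction_ul=50.0)

# 1 uL of a 1:20 dilution carries the equivalent of 1/20 uL of primary reaction.
primary_equivalent_ul = 1.0 / 20

print(betaine_ul, primary_equivalent_ul)
```

Under these assumed values, half of the reaction volume would be betaine stock, which is why a concentrated stock is convenient when the target is 2.5 M.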

RT-PCR products were subcloned using pGEM-T (Promega, Madison, Wis.) as recommended by the manufacturer. The resulting RT-PCR clones were sequenced with vector primers and internal primers using the ABI dye terminator chemistry (Perkin Elmer, Foster City, Calif.) and an ABI377 automated sequencer (Perkin Elmer, Foster City, Calif.). Multiple sequence alignments were performed using ClustalW (Thompson et al., (1994) Nucleic Acids Res. 22:4673-4680).

Sequence analysis of the RT-PCR products indicated that hNET contains at least six exons. The RT-PCR data indicate that the fourth predicted exon is actually split by an intron in the human netrin gene and is present as two exons. Three of the RT-PCR exons were shown to be identical to the original exon traps. Aside from the extra exon, the gene model is nearly identical to the RT-PCR products. The cDNA coding sequence, predicted protein product and full length sequence are shown in FIGS. 4A through 4C, respectively.

Northern blot analysis: Genomic and RT-PCR probes were radiolabeled (Feinberg and Vogelstein, Anal. Biochem. 132:6-13, 1983) and used to probe Northern blots containing RNAs from a variety of adult tissues (Clontech, Palo Alto, Calif.), including a panel of RNAs from different neural tissues including spinal cord. In addition, a human RNA Master Blot (Clontech, Palo Alto, Calif.) containing RNAs from 50 different adult and fetal tissues was screened as recommended by the manufacturer.

hABC3

A human lung cDNA library (LTI, Gaithersburg, Md.) was screened with the GeneTrapper system (LTI, Gaithersburg, Md.) using capture and repair oligonucleotides (5'-CATTGCCCGTGCTGTCGTG-3' (SEQ ID NO:52) and 5'-CATCGCCGCCTCCTTCATG-3' (SEQ ID NO:53), respectively) designed from trapped exon L48757, the 5' most trapped exon with homology to murine ABC1. Direct cDNA library screening was also performed using an RT-PCR clone as probe. 5' RACE (Frohman, M. A. in Methods Enzymol. (J. N. Abelson and M. I. Simon, Eds.) pp. 340-356, Academic Press, San Diego, Calif., 1993) was used to isolate additional 5' sequences from the ABC3 transcript.

Northern blot analysis: A 679 bp fragment from the 3' untranslated region (UTR) of the ABC3 cDNA was radiolabeled by random priming (Feinberg et al., supra. 1983) and used to probe a multiple tissue northern blot (Clontech, Palo Alto, Calif.) under conditions recommended by the manufacturer.

Identification of coding sequence for the novel ABC transporter: The gene for a novel ATP binding cassette (ABC) transporter, designated ABC3, has been mapped to the PKD1 locus on chromosome 16 (Burn et al., Genome Res. 6:525-537, 1996). Eight exons from the hABC3 gene were obtained from the 30.1F, 64.12C and 96.4B P1 clones using exon trapping. See FIG. 16, which shows the genomic interval surrounding the hABC3 gene at the top, with NotI sites, DNA markers, and distance in kilobases (kb) also being shown. Genomic P1 clones from the interval which contain sequence from the hABC3 gene are shown below the genomic map. The relative position of the hABC3 cDNA is provided below the P1 clones, with the selected cDNA, trapped exons, RT-PCR clones, and cDNAs being indicated. Trapped exons and RT-PCR clones used in the isolation of additional hABC3 sequences have been labeled. The discontinuity in the line for clone ABCgt.1 represents the absence of an alternatively spliced exon.

Seven of these trapped exons encoded sequences having homology to murine ABC1 and ABC2 based on BLASTX analysis (Altschul et al., supra. 1990; Gish et al., supra. 1993), with sequences from the trapped exons L48758, L48759, and L48760 having highest homology. Sequences encoded by the trapped exon L48760 also had homology to a Caenorhabditis elegans ABC transporter predicted from genomic sequence (Wilson et al., supra.).

cDNA selection yielded a single 261 bp cDNA clone which mapped near the 5' end of the ABC3 gene. Like L48760, this clone encoded sequences having homology to the hypothetical C. elegans ABC transporter. Initial analysis of the SASE results from the 30.1F P1 clone indicated that 4 of the 164 reactions encoded sequences with homology to ABC1 or ABC2. Subsequent comparison of the SASE data to the final hABC3 cDNA indicated that an additional seven sequencing reactions contained coding sequences from the ABC3 gene. A total of 1.6 kb of ABC3 coding sequence aligned with the SASE data. In that only 3.5 kb of coding sequence from the 5' end of the hABC3 gene maps to the 30.1F P1 clone, this represents a level of 45% coverage for the SASE analysis.
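
The coverage figure quoted above can be checked directly from the numbers in the paragraph. This is a minimal sketch; the function name is illustrative, and the result is computed from the rounded values given in the text (1.6 kb and 3.5 kb), so it agrees with the stated 45% only approximately.

```python
def coverage_percent(aligned_kb, total_kb):
    """Percentage of the coding sequence that aligned with the SASE reads."""
    return 100.0 * aligned_kb / total_kb


# Values taken from the text: 1.6 kb aligned out of 3.5 kb mapping to P1 30.1F.
sase_coverage = coverage_percent(1.6, 3.5)

# 4 reactions found initially plus 7 found later, out of 164 reactions.
informative_fraction = 100.0 * (4 + 7) / 164

print(f"coverage: {sase_coverage:.1f}%")
print(f"informative reactions: {informative_fraction:.1f}%")
```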

Assembly and analysis of a cDNA for the novel ABC transporter: Two complementary approaches were employed to assemble the full-length hABC3 cDNA. First, RT-PCR was utilized to link the trapped exons, selected cDNA, and SASE data. Secondly, cDNA library screening was performed using direct selection as well as radiolabeled probes.

Using primers designed from the trapped exons L48757, L48758, L48760 and L75924, three RT-PCR products, containing 3.3 kb of coding sequence, were cloned (Table I and FIG. 16). An additional RT-PCR primer was designed from a region of identity between the selected cDNA and the SASE data (Table I). A 900 bp RT-PCR clone was obtained using the latter primer in conjunction with a trapped exon derived primer. In total, 4.2 kb of coding sequence was obtained using RT-PCR.

Several cDNAs were cloned using the GeneTrapper direct selection system and oligos designed from the 5' most trapped exon encoding sequences with homology to ABC1 (trapped exon L48757). The longest clone isolated with the GeneTrapper system was 5719 bp in length (ABCgt.1) (FIG. 8). This cDNA contains a 792 bp 3' untranslated region with a consensus polyadenylation/cleavage site 20 bp upstream of the polyA tail. An additional cDNA clone (ABC.5) was isolated using a radiolabeled 1.1 kb RT-PCR product (ABC3-12) as a probe (FIG. 16). The 5' end of the ABC3 cDNA was further characterized using 5' RACE, with several RACE products containing multiple in-frame stop codons upstream of the start methionine.

Sequence analysis indicated that clone ABCgt.1 lacks 147 bp of sequence found in the RT-PCR clones and the cDNA clone ABC.5. The additional 147 bp segment is likely to be the result of alternative splicing, in that it does not interrupt the open reading frame. The presence of both transcript populations has been confirmed by PCR using primers flanking the alternatively spliced exon.
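
The observation that the 147 bp segment leaves the reading frame intact follows from its length being a multiple of three. A minimal sketch of that check is given below; the function name is illustrative. Divisibility by three is a necessary condition only, since an inserted segment could still carry an in-frame stop codon.

```python
def preserves_reading_frame(segment_bp):
    """True if inserting or removing a segment of this length keeps the frame."""
    return segment_bp % 3 == 0


segment_bp = 147  # length of the alternatively spliced segment, from the text
print(preserves_reading_frame(segment_bp))  # frame is kept
print(segment_bp // 3)                      # number of codons in the segment
```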

A 6.4 kb cDNA has been assembled for the hABC3 transporter. The assembled cDNA contains a 5116 nucleotide long open reading frame encoding 1705 amino acids, with the predicted protein having a molecular weight of 191 kDa. The proposed start methionine is 50 bp upstream of the 5' end of clone ABCgt.1. Although the sequence surrounding the start methionine matches the Kozak sequence in only 6 of 10 positions (Kozak, J. Cell Biol. 115:887-903, 1991), the two positions which have been shown to be critical for function (an A at -3 and a G at +4) are conserved in hABC3. The hABC3 cDNA contains a 792 bp 3' UTR with a consensus polyadenylation/cleavage site 20 bp upstream of the polyA tract.

A 6.8 kb transcript is detected by a 3' UTR cDNA probe on northern blots with highest levels of expression being observed in lung with lesser amounts in brain, heart, and pancreas. Significantly lower levels of expression were observed in placenta and skeletal muscle after longer exposure times. The ABC3 transcript was not detected in either liver or kidney.

RPL3L (SEM L3)

The longest cDNA is 1548 nucleotides in length (FIG. 11). All three cDNAs have an open reading frame (ORF) of 1224 nucleotides, with the longest cDNA containing a 48 nucleotide 5' untranslated region. An in-frame stop codon at position 7 is followed by the Kozak initiation sequence CCACCATGT (SEQ ID NO:68) (Kozak, supra.). The 3' UTRs of the three cDNAs vary in length, and each lacks a consensus polyadenylation/cleavage site.
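
The initiation context quoted above (SEQ ID NO:68) can be inspected at the two positions usually treated as most informative in a Kozak context, namely -3 and +4, counting the A of the ATG as +1. The following is a minimal sketch; the function name and return convention are illustrative assumptions, and the sketch only reports the bases without scoring the site.

```python
def kozak_key_positions(context, atg_index):
    """Return the bases at -3 and +4 relative to the A (+1) of the start ATG.

    `atg_index` is the 0-based index of the A of the ATG within `context`.
    A position that falls outside the supplied context is returned as None.
    """
    if context[atg_index:atg_index + 3].upper() != "ATG":
        raise ValueError("atg_index does not point at an ATG")
    minus3 = context[atg_index - 3].upper() if atg_index >= 3 else None
    plus4 = context[atg_index + 3].upper() if atg_index + 3 < len(context) else None
    return minus3, plus4


context = "CCACCATGT"  # SEQ ID NO:68 as quoted in the text
minus3, plus4 = kozak_key_positions(context, context.find("ATG"))
print(minus3, plus4)
```

For the nine bases quoted, this reports an A at -3 and a T at +4.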

The longest cDNA was compared to the human, bovine and murine ribosomal L3 genes. At the nucleotide level there is only 74% identity between the RPL3L (SEM L3) cDNA and the consensus from these other ribosomal L3 cDNAs. This is in sharp contrast to the 98% identity shared between human, bovine, and murine L3 nucleotide sequences. There is no similarity between the 3' UTR of the cDNAs isolated here and the other L3 genes.

hALR

Sequences were cloned from the human ALR gene by 3' RACE using primers (e.g., external 5'-TGGCCCAGTTCATACATTTA-3' (SEQ ID NO:69) and internal 5'-TTACCCCTGTGAGGAGTGTG-3' (SEQ ID NO:70)) designed from the exon trap. A total of 468 bp have been obtained from the human ALR gene (FIG. 13).

EXAMPLE VII Amino Acid Sequence Analysis

hNET

hNET cDNA has at least 210 bp of 5' untranslated sequence, a 5' start methionine codon, a 3' stop codon (TGA) and is predicted to be 580 amino acids in length (FIG. 4), with the common domain structure of the netrin family being conserved (FIG. 20A). Overall, the human netrin was found to have higher homology to chicken netrin-2 than netrin-1, i.e., 56.3% versus 53.9%. As is the case with the other members of the netrin family, the region of greatest conservation includes the three EGF repeats, while the C-terminal domains are less well conserved (FIG. 20A). The EGF repeats are 78.7% and 82.2% identical between the human netrin and chicken netrin-1 and netrin-2, respectively, and 66.3% identical when compared to UNC-6. The C-terminal domains of the human netrin and chicken netrin-1 and -2 are 41.9% and 42.5% identical, respectively, with the same domain of UNC-6 being only 29.4% identical to human netrin. Overall, the human netrin more closely resembles the chicken netrins and UNC-6 than Drosophila NETA and NETB, since NETA contains an expansion in the C-domain while NETB contains additional sequences in the VI and V-1 domains (Harris et al., 1996, supra; Mitchell et al., 1996, supra).
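
Percent identity values such as those above are computed over an alignment. The following is a minimal sketch of one common convention, in which columns containing a gap in either sequence are excluded from the count; the text does not state which denominator was used, so this convention is an assumption, and the two short aligned strings are toy data rather than netrin sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns, skipping any column with a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gap columns are not counted under this convention
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment (hypothetical): 9 comparable columns, 8 of them identical.
identity = percent_identity("CDCHPVGA-G", "CDCNPVGAAG")
print(f"{identity:.1f}%")
```

Counting gap columns as mismatches instead would lower the reported value, which is one reason identity figures from different programs are not always directly comparable.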

The Structure of the Netrin Genes is Conserved Between Drosophila and Human

The positions of the introns in the human gene were compared to the encoded protein to determine if the overall gene structure of the netrin/UNC-6 family is conserved (FIG. 20B). This analysis revealed striking similarities between the Drosophila netrin genes and the human netrin gene. In the human gene, exon 1 contains the signal peptide, domain VI and the first EGF domain (domain V-1), while exons two and three each contain an EGF repeat, domains V-2 and V-3, respectively. Exons 4, 5, and 6 contain portions of the C-domain. With the exception of an additional intron in the C-domain, this motif/exon arrangement is conserved in the Drosophila netrin genes. The coding regions of the two Drosophila netrin genes have been shown to be highly conserved with each being disrupted by six introns that occur in homologous sites (Harris et al., 1996, supra). The position of five of the six Drosophila introns was found to be conserved in the human gene (FIG. 20B). The UNC-6 gene contains 12 introns in the coding region (Ishii et al., 1992, supra), the position of five of which correlate with the positions of the introns in the human gene. Interestingly, the sixth Drosophila intron, which does not have a counterpart in the human gene, is also the only intron from Drosophila that is not conserved in the UNC-6 gene.

hABC3

Database searches revealed homology between ABC3 and murine ABC1 and ABC2 (Luciani et al., supra. 1994). In addition to the murine ABC1 and ABC2 proteins, ABC3 also shows homology to the putative C. elegans protein encoded by the cosmid sequence of C48B4.4 (Wilson et al., supra.). Overall, ABC3, ABC1, ABC2 and sequences encoded by C. elegans cosmid C48B4.4 have highest homology in the regions surrounding the ATP binding cassettes (FIG. 17). However, when one compares the sequence between the first ATP binding cassette and the second transmembrane domain, referred to as the linker domain (Luciani et al., supra. 1994), ABC3 shares much lower homology with these same three proteins (amino acids 765-1044 in ABC3 in FIG. 17). The linker domain of ABC3 is approximately 200 residues shorter than the linker domain present in ABC1 and ABC2. Consequently, an optimum protein alignment positions a gap in the ABC3 sequence immediately C-terminal of a conserved HH1 hydrophobic domain (Luciani et al., supra. 1994), located at position 917 through 959 in ABC3 (FIG. 17). Additional comparisons indicate that the ABC3 linker domain is nearly identical in size to the linker domain encoded by C. elegans cosmid C48B4.4. As is the case with ABC1 and ABC2, the linker domain of ABC3 contains numerous polar residues and several potential phosphorylation sites.

Further analysis of the deduced ABC3 protein sequence revealed additional similarities to the ABC1/ABC2 subfamily. Based on PSORT analysis (Nakai et al., supra.), the ABC3 protein does not appear to contain an N-terminal signal sequence and is likely to be a Type III membrane protein (Singer, Annu. Rev. Cell Biol. 6:247-296, 1990), with sequences N-terminal of the first transmembrane domain being located in the cytoplasm (FIG. 17). Similar topography has been described for ABC1 (Luciani et al., supra. 1994) and all other ABC transporters described to date (Higgins, supra. 1992). As mentioned above, murine ABC1 and ABC2 have been shown to contain a novel hydrophobic region, HH1, within the conserved linker domain. Although the HH1 domain is not well conserved at the amino acid level in ABC3, an HH1 domain does appear to be present within the linker region based on hydrophilicity analysis. A similar HH1 domain is also found in sequences encoded by cosmid C48B4.4 from C. elegans. In all these cases, the HH1 domain is predicted to have a β-sheet conformation.

RPL3L (SEM L3)

The RPL3L (SEM L3) cDNA open reading frame predicts a 407 amino acid polypeptide of 46.3 kD (FIG. 11). In vitro transcription/translation of RPL3L (SEM L3) cDNA resulted in a protein product with an apparent molecular weight of 46 kD, which is in close agreement with the predicted weight of 46.3 kD.
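
The 407 residue length can be reconciled with the 1224 nucleotide ORF reported for the RPL3L cDNAs if the ORF length is taken to include the stop codon. That reading is an assumption made here for illustration; the sketch below, with an illustrative function name, shows the arithmetic and the average residue mass implied by the predicted 46.3 kD.

```python
def orf_to_residues(orf_nt, includes_stop=True):
    """Number of encoded amino acids for an ORF of the given length."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


residues = orf_to_residues(1224)       # assumes the stop codon is counted
mean_residue_da = 46300 / residues     # predicted mass spread over the residues

print(residues)
print(round(mean_residue_da, 1))
```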

Two nuclear targeting sequences, which are 100% conserved between man, mouse and cow, diverged slightly in the RPL3L (SEM L3) amino acid sequence. The first targeting site is the 21 amino acid N-terminal oligopeptide. The serine and arginine present at positions 13 and 19, respectively, in human, bovine and murine L3 are replaced with histidines in RPL3L (SEM L3) (FIG. 12). The second potential nuclear targeting site is the bipartite motif. Here the human, bovine and murine proteins have a KKR-(aa)₁₂-KRR at position 341-358 while the SEM L3 gene has KKR-(aa)₁₀-HHSRQ at position 341-358. The second half of this bipartite motif, while remaining basic, does not match those found in other nuclear targeting motifs (Simonic et al., supra. 1994). Overall, there is 77.2% amino acid identity between the RPL3L (SEM L3) and the consensus from the other mammalian L3 ribosomal genes, with 56% of the nucleotide differences between RPL3L (SEM L3) and the human L3 being silent.
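
The two forms of the bipartite motif described above can be written as simple patterns, which also makes it easy to confirm that both span 18 residues, consistent with positions 341-358. The following is a minimal sketch; the pattern names, the helper function, and the toy protein strings are hypothetical and are not taken from the L3 or RPL3L sequences.

```python
import re

# KKR, 12 arbitrary residues, KRR (the form described for human, bovine, murine L3)
L3_MOTIF = re.compile(r"KKR.{12}KRR")
# KKR, 10 arbitrary residues, HHSRQ (the form described for RPL3L)
RPL3L_MOTIF = re.compile(r"KKR.{10}HHSRQ")


def find_motif(pattern, protein):
    """Return the 1-based (start, end) of the first match, or None."""
    match = pattern.search(protein)
    if match is None:
        return None
    return match.start() + 1, match.end()


# Toy sequences built only to exercise the patterns.
toy_l3 = "MA" + "KKR" + "A" * 12 + "KRR" + "GG"
toy_rpl3l = "MA" + "KKR" + "A" * 10 + "HHSRQ" + "GG"

print(find_motif(L3_MOTIF, toy_l3))
print(find_motif(RPL3L_MOTIF, toy_rpl3l))
print(find_motif(L3_MOTIF, toy_rpl3l))
```

Both patterns cover 18 residues (3 + 12 + 3 and 3 + 10 + 5), so a match starting at residue 341 would end at residue 358 in either case.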

hALR

hALR cDNA sequences encode a 119 amino acid protein which is 84.8% identical and 94.1% similar to the rat ALR protein (see FIGS. 13 and 14).

Although the invention has been described with reference to the disclosed embodiments, it should be understood that various modifications can be made without departing from the spirit of the invention. Accordingly, the invention is limited only by the claims which follow the Sequence Listing.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

   (iii) NUMBER OF SEQUENCES: 83

(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 179 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Leu His Leu Glu Gly Pro Phe Ile Ser Arg Glu Lys Arg Gly Thr His
                                                        15
Pro Glu Ala His Leu Arg Ser Phe Glu Ala Asp Ala Phe Gln Asp Leu
                                                    30
Leu Ala Thr Tyr Gly Pro Leu Asp Asn Val Arg Ile Val Thr Leu Asp
                                                45
Pro Glu Leu Gly Arg Ser His Glu Val Phe Arg Thr Leu Thr Xaa Arg
                                            60
Ser Ile Cys Val Ser Leu Gly His Ser Val Ala Asp Leu Arg Ala Ala
                                                            80
Glu Asp Ala Val Trp Ser Gly Ala Thr Phe Ile Thr His Leu Phe Asn
                                                        95
Ala Met Leu Pro Phe His His Arg Asp Pro Gly Ile Val Gly Leu Leu
                                                    110
Thr Ser Asp Arg Pro Ala Gly Arg Cys Ile Phe Tyr Gly Met Ile Ala
                                                125
Asp Gly Thr His Thr Asn Pro Ala Ala Leu Arg Ile Ala His Arg Ala
                                            140
His Pro Gln Gly Leu Val Leu Val Thr Asp Ala Ile Pro Ala Leu Gly
145                 150                 155                 160
Leu Gly Asn Gly Arg His Thr Leu Gly Gln Gln Glu Val Glu Val Asp
                                                        175
Gly Leu Thr

(2) INFORMATION FOR SEQ ID NO:2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 90 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

His Leu Glu Gly Pro Phe Ile Ser Lys Arg Gly His Pro Glu Ser Tyr
                                                        15
Gly Asn Ile Val Thr Pro Glu Leu Glu Val Ser Gly His Ser Ala Leu
                                                    30
Glu Ala Val Ser Gly Ala Ile Thr His Leu Phe Asn Ala Met His His
                                                45
Arg Asp Pro Gly Gly Leu Leu Thr Ser Leu Tyr Gly Ile Asp Gly His
                                            60
Thr Ala Leu Arg Ile Ala Gly Leu Val Leu Val Thr Asp Ala Ile Ala
                                                            80
Leu Gly Gly His Leu Gly Gln Val Gly Leu
                                    90

(2) INFORMATION FOR SEQ ID NO:3:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 64 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Leu His Leu Glu Gly Pro Lys Gly Thr His Arg Ala Ala Asp Leu Asp
                                                        15
Val Thr Leu Pro Glu Glu Val Leu Ile Val Ser Gly His Ser Ala Leu
                                                    30
Ala Gly Thr Phe Thr His Leu Asn Ala Met Pro Gly Leu Leu Ile Gly
                                                45
Ile Ala Asp Gly His Ala Arg Ala Arg Leu Leu Val Thr Asp Ala Gly
                                            60

(2) INFORMATION FOR SEQ ID NO:4:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 55 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Leu His Glu Pro Ser Glu Lys Gly His Arg Asp Leu Gly Asp Thr Glu
                                                        15
Ile Val Ser Gly His Ser Ala Ala Ala Gly Ala Thr Phe Thr His Leu
                                                    30
Asn Ala Met Pro Gly Gly Ile Asp Gly His Asn Arg Ile Leu Val Thr
                                                45
Asp Ile Ala Gly Leu Gly Thr
                        55

(2) INFORMATION FOR SEQ ID NO:5:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 49 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Cys Asp Cys His Pro Val Gly Ala Ala Gly Lys Thr Cys Asn Gln Thr
                                                        15
Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Leu Thr Cys Asn
                                                    30
Arg Cys Ala Pro Gly Phe Gln Gln Ser Arg Ser Pro Val Ala Pro Cys
                                                45
Val

(2) INFORMATION FOR SEQ ID NO:6:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 48 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Cys Asp Cys His Pro Val Gly Ala Ala Gly Lys Thr Cys Asn Gln Thr
                                                        15
Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Leu Thr Cys Asn
                                                    30
Arg Cys Ala Pro Gly Phe Gln Gln Ser Arg Ser Pro Val Ala Pro Cys
                                                45

(2) INFORMATION FOR SEQ ID NO:7:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 44 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Cys Asp Cys His Pro Val Gly Ala Ala Gly Thr Cys Asn Gln Thr Thr
                                                        15
Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Thr Cys Asn Arg Cys
                                                    30
Ala Lys Gly Gln Gln Ser Arg Ser Pro Ala Pro Cys
                            40

(2) INFORMATION FOR SEQ ID NO:8:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 35 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Cys Cys His Pro Val Gly Gly Cys Asn Gln Gly Gln Cys Cys Lys Gly
                                                        15
Val Thr Gly Thr Cys Asn Arg Cys Ala Lys Gly Gln Gln Ser Arg Ser
                                                    30
Val Pro Cys
        35

(2) INFORMATION FOR SEQ ID NO:9:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 49 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

His Ser Pro Ser Leu Ser Ala Glu Thr Pro Ile Pro Gly Pro Thr Glu
                                                        15
Asp Ser Ser Pro Val Gln Pro Gln Asp Cys Asp Ser His Cys Lys Pro
                                                    30
Ala Arg Gly Ser Tyr Arg Ile Ser Leu Lys Lys Phe Cys Lys Lys Asp
                                                45
Tyr

(2) INFORMATION FOR SEQ ID NO:10:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 21 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Ile Ser Pro Asp Cys Asp Ser Cys Lys Pro Ala Gly Tyr Ile Lys Lys
                                                        15
Cys Lys Lys Asp Tyr
            20

(2) INFORMATION FOR SEQ ID NO:11:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 21 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

Pro Pro Thr Ser Ser Pro Asp Cys Asp Ser Cys Lys Gly Ile Lys Lys
                                                        15
Cys Lys Lys Asp Tyr
            20

(2) INFORMATION FOR SEQ ID NO:12:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 88 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

Met Leu Val Gly Asp Ser Gly Val Gly Lys Thr Cys Leu Leu Val Arg
                                                        15
Phe Lys Asp Gly Ala Phe Leu Ala Gly Thr Phe Ile Ser Thr Val Gly
                                                    30
Ile Asp Phe Arg Asn Lys Val Leu Asp Val Asp Gly Val Lys Ala Lys
                                                45
Leu Gln Met Trp Asp Thr Ala Gly Gln Glu Arg Phe Arg Ser Val Thr
                                            60
His Ala Tyr Tyr Arg Asp Ala His Ala Leu Leu Leu Leu Tyr Asp Val
                                                            80
Thr Asn Lys Ala Ser Phe Asp Asn
                85

(2) INFORMATION FOR SEQ ID NO:13:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 83 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

Met Leu Val Gly Asp Ser Gly Val Gly Lys Thr Cys Leu Leu Val Arg
                                                        15
Phe Lys Asp Gly Ala Phe Leu Ala Gly Thr Phe Ile Ser Thr Val Gly
                                                    30
Ile Asp Phe Arg Asn Lys Val Leu Asp Val Asp Gly Lys Lys Leu Gln
                                                45
Trp Asp Thr Ala Gly Gln Glu Arg Phe Arg Ser Val Thr His Ala Tyr
                                            60
Tyr Arg Asp Ala His Ala Leu Leu Leu Leu Tyr Asp Thr Asn Lys Ser
                                                            80
Phe Asp Asn

(2) INFORMATION FOR SEQ ID NO:14:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 83 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
                       - Phe Gln Asn His Phe Glu Pro Gly Val Tyr Va - #l Cys Ala Lys Cys Gly         #                15                                                           - Tyr Glu Leu Phe Ser Ser Arg Ser Lys Tyr Al - #a His Ser Ser Pro Trp         #            30                                                               - Pro Ala Phe Thr Glu Thr Ile His Ala Asp Se - #r Val Ala Lys Arg Pro         #        45                                                                   - Glu His Asn Arg Ser Glu Ala Leu Lys Val Se - #r Cys Gly Lys Cys Gly         #    60                                                                       - Asn Gly Leu Gly His Glu Phe Leu Asn Asp Gl - #y Pro Lys Pro Gly Gln         #80                                                                           - Ser Arg Phe                                                                 - (2) INFORMATION FOR SEQ ID NO:15:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #acids    (A) LENGTH: 28 amino                                                          (B) TYPE: amino acid                                                          (C) STRANDEDNESS: Not R - #elevant                                            (D) TOPOLOGY: unknown                                               -     (ii) MOLECULE TYPE: peptide                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:                                - Phe Pro Gly Tyr Val Gly Leu Phe Ser Ser Ly - #s Tyr Trp Pro Phe Thr         #                15                                                           - Ile Ala Ser Val Val Leu Gly His Phe Asp Gl - #y Pro                         #            25                                                               - (2) INFORMATION FOR SEQ ID NO:16:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #acids    (A) LENGTH: 26 
amino                                                          (B) TYPE: amino acid                                                          (C) STRANDEDNESS: Not R - #elevant                                            (D) TOPOLOGY: unknown                                               -     (ii) MOLECULE TYPE: peptide                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:                                - Glu Gly Val Tyr Cys Ala Cys Asp Leu Ser Se - #r Lys Trp Pro Ala Phe         #                15                                                           - Glu Ala Cys Cys Leu Gly His Phe Gly Lys                                     #            25                                                               - (2) INFORMATION FOR SEQ ID NO:17:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #acids    (A) LENGTH: 32 amino                                                          (B) TYPE: amino acid                                                          (C) STRANDEDNESS: Not R - #elevant                                            (D) TOPOLOGY: unknown                                               -     (ii) MOLECULE TYPE: peptide                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:                                - Phe His Phe Glu Gly Tyr Val Cys Cys Gly Gl - #u Leu Phe Ser Lys Trp         #                15                                                           - Pro Ala Phe Glu Val Cys Cys Leu Gly His Ph - #e Asn Asp Gly Pro Lys         #            30                                                               - (2) INFORMATION FOR SEQ ID NO:18:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #acids    (A) LENGTH: 28 amino                                                          (B) TYPE: amino acid                                             
             (C) STRANDEDNESS: Not R - #elevant                                            (D) TOPOLOGY: unknown                                               -     (ii) MOLECULE TYPE: peptide                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:                                - Phe Gly Tyr Val Gly Phe Ser Ser Lys Trp Pr - #o Phe Thr Ile Asp Val         #                15                                                           - Gly Asn Leu Gly His Phe Asp Gly Pro Lys Gl - #y Arg                         #            25                                                               - (2) INFORMATION FOR SEQ ID NO:19:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 6803 base                                                         (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: DNA (genomic)                                       -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:                                - GGAGCTCGGT TGGAAACCCC CCGAGGCATA ATAGGCGCTC GATAAATGTG CA - #ATAGGTGA         60                                                                          - ACATGTGGTG GCTTGCAGGC GTCTGGGGGG AGACAGCAGG TTCTGGGCTG GG - #CAGGGAAT        120                                                                          - TATTGGATCA ACGGGCATCT TACAGGAAAG ACTCTCAGCT CCCTGCCGCC TA - #GGACTGTC        180                                                                          - CAGCCCATCT ATGCCCTCTC CCCAGCCTGT GCCCCAAAGC TGGAGCTGCC AC - #TCTAGGGG        240                                                                          - TGAGGGGTGG GGTGGGGAGG GGGAGGCGAA GCACTGCGGC CTGAGTTGCA GG - #TGGGGGGA        300                                           
GGGGAGGCGG AGCTTCTTTG TTGCAGAAGG TGCCAGGAGG GGGCAGGGCC AGTGGAGAGG    360
TGGGAGGTGG GAGAGGCCCC AGCCAGGGGC TGGGACAGGT GGCTGGGTCC CTGGGGAGCA    420
ATAAGTCCCG CTTGGGCGCT GTGGGGAGGC CCTTCCTAAC TCCCAAACAC CATCTGTGAG    480
GGCTGGGGGT GGGGGCAGAG TAGCGTGTGC AGAGGACTGT TCCTGGGGAG AGGCCCTGTG    540
ACCAGCGGCC TCCTCCCTGG GGAGCTGGCG GTACAATGGC CCTCTGGGCC CACGGCCTCC    600
CGCCGCTGCT GCTGACCCAG ATGAACAATT GGGGCAGGGC TGAGCCCCAG GCACCTACTT    660
TCCCCCACCC CAGAAGCCAC CAGACGTTCT GCAGACCCCA GTCCTGGCTC ACAGGGAAGC    720
TGAGCTGGAG ACAAAGCCAG CCCCTCTGAT GAGGGTGGAA GAGGCTGCTG GCCACTGTCC    780
CTCTTGCAGC CTGGCTGGCA GCCAGTCTGG CAGTGGCCCT GACGTCCAGA GACAGCTTGG    840
GTTTCCCCAG AGGCTTGTCT CTGGCCAGTG GGACCCCTCT GTCAGGCCTG GGCTTTTCTC    900
TCCACTGTCC CAGAATGATG ATCTCAGCCC CCATAGTCCC CCCAGGGTTC CTCCCACCCT    960
TAGGGTGGGG TGTCGGGGGG TGGGGGTTGG GAGCCAGAAG GACCTTGAAG AGGGTGGTTG   1020
GGACGTTTCA GGTTCTAAGC TTGACCCACA GAGCGGAGCG TGAGCCCCGT CAGGTTGAGG   1080
TCCCTCAACT TGTAAAGGAC ACAATTCCAT TCTCTTTATC AGGAAGCTGA GGGGCAGGGG   1140
CCCTGTGGCA GAGAGAGAGC CCCTTAGCCC TCTCTGTTCA GTCCTCCGGT GCCCCCATCC   1200
CTGTGCATCT GTGGCTGTCA CATGCAGATG TGTGGCAAGG AGAAGGTGCC CACCAGCCAG   1260
TGTCAGTTGC TCCAGGAGCC AAGCCAGGTG CCCTATCACC CTGTCTTCCC GTTCCTCCCC   1320
TCCATGGTCA GGCCCTCCTG CTCCCTCCTC TGGTCCTTCA GTTTCCCCTA GGAGGCTTCC   1380
GTGTCCTCCT GCCCCTCCTC TCCCCAACAG CGGGATGCGT CTACCTCTCC ATTCTCTTCC   1440
TCCTGGTCCT TGCTCATCTC TGGTCGTGTC CAGGGTAGCA CCCACGTGGC CTCCTCCACC   1500
AGCTGCAGGC CTGGCCTCCC ATCTGAAACG GGGCATTCAG GCCTCGATGC TGGCCCTGCA   1560
CGGAACTTGT TCCCTGCCCC TCCCTGGGAT GCTTGGCCTC CTCTGTCAAG GACCTGAAAG   1620
TCGGAGGGGA GGAGGTTTCT CTGACCAGAG CTGTTCCTGG ACCCTCTTTG GTGGTGTCGC   1680
TCCCAGGCAC AGCTACCCCA TCCCCAGCTA GTCCCCAGGC CACCCAGCTG GGCTTCTGCC   1740
TCAGTTTCCC TGCCCAAACG TGCTGTGACG TAGGGCAGTG GGCTCCGGGT TGCGACCAGC   1800
CCCTTCCCAT GATTAAACCC TACTCCCTGC CCCTGCAGAG GGGTCCTCAA CAGCTAACCA   1860
AGCCCCCGAA CCCCAAGAAG CCACCCCATC CCACCCTCCA GCTTCCATGT CCTCCCTGCC   1920
AGCTGGGCCC GTGGCAGAGG TGCCCCTAGA AACTTGCAGA CCCAGGGAGC TTTGGGATCA   1980
GAATCTGGCC TGGTGCAGGG GATGCTGGCC TCATGTCTTA GCCCAGCTCA GGCCCATGGG   2040
GGTGCCCCCC TTCCTCAACA TGGGCAGGAG ACACTCCAAT TTGTGCAGCT CTCGACTTGG   2100
GCCTGATGCC ACTTGAGACT CATCAAATCC AACAGCTTCA GAGCGCGTGC TGAGTAACAG   2160
GCATCTGGCA GGTGAGGAAA CAGGAGCCCA AGACATGCAG CCAGAAATGG GGCAGTTGGA   2220
TTCAAAATTA GACCTGACCG AATCCTGGGT TCCTTCTACT CGAGTAGATG CTGCTTTGGG   2280
GATGACCCTT CAACTGGTGG TTACTTGGCT TCCCTACCTG GGGAACATCC AGGGCCTCTG   2340
CTGTCAGACC CGGGGCCTTG CCTGCCTGAT GGTCTTCAGG GAGGAGGCGA CCCAGACCCC   2400
CGTCCAGCAC GTGGCACAGC CCCAGGAGCA GTAAAGACCT GGCTGTGGGC CCAGGACCCT   2460
GCTGGGTGGT CCCCCACGGG CTGCGAAGGC TGAGCTGCCC CCCTCCAGAC CCCTCCCGCC   2520
AGCGCATTCC TGGCTCCCCG GCCCCTCCCC TGGCTCCCGG GCCTCCCAGC CCCCTTCCCC   2580
GCTGGCCCAG CCCGCGTCTG AATCTGCTTC TGATTCCAGC TCTGCGATGA GGCCCCCTCC   2640
CCTCCCCTGC CTCCTTCCCG ACCCGAGCAG CCCCGCCCCC GGCTGGGCCC GGGCTTGCGC   2700
CTGCTGCGCC CCCCACCCCC TCCTGGCACA GCTCGTCCGC CCTCGCTGCA GCCGGGAGGA   2760
GGCGGCGGCC CGTGCACCGC AGGCCCCGCC CGCCCACGGC CCTTCCCGGG AGGCCGGGAG   2820
ACCTGCTCCG CCCGGCCCTC GGTGGGTGAG TGCGAGCGGC GGGTGGGGCC TCCGCGGGCG   2880
GAGGCACCGG GAGCGGGGGC GACGCCTGTC ATCGCTCTAG GCCCAGCGGG AGGACGCGCC   2940
AACATCCCCG CTGCTGTGCT GGGCCCGGGG CGTGCCCGCC GCTGCTCCCA CCTCTGGGCC   3000
GGGCTGGGGC CGCCCGGGGG CCCTGTTCCT CGGCATTGCG GGCCTGGTGG GCAGAGCCGC   3060
GGAGAGGGCT TCTTTTCCCC AAGGGCAGCG TCTTGGGGCC CGGCCACTGG CTGACCCGCA   3120
GCGGCTCCGG CCATGCCTGG CTGGCCCTGG GGGCTGCTGC TGACGGCAGG CACGCTCTTC   3180
GCCGCCCTGA GTCCTGGGCC GCCGGCGCCC GCCGACCCCT GCCACGATGA GGGGGGTGCG   3240
CCCCGCGGCT GCGTGCCAGG ACTGGTGAAC GCCGCCCTGG GCCGCGAGGT GCTGGCTTCC   3300
AGCACGTGCG GGCGGCCGGC CACTCGGGCC TGCGACGCCT CCGACCCGCG ACGGGCACAC   3360
TCCCCCGCCC TCCTTACTTC CCCAGGGGGC ACGGCCAGCC CTCTGTGCTG GCGCTCGGAG   3420
TCCCTGCCTC GGGCGCCCCT CAACGTGACT CTCACGGTGC CCCTGGGCAA GGCTTTTGAG   3480
CTGGTCTTCG TGAGCCTGCG CTTCTGCTCA GCTCCCCCAG CCTCCGTGGC CCTGCTCAAG   3540
TCTCAGGACC ATGGCCGCAG CTGGGCCCCG CTGGGCTTCT TCTCCTCCCA CTGTGACCTG   3600
GACTATGGCC GTCTGCCTGC CCCTGCCAAT GGCCCAGCTG GCCCAGGGCC TGAGGCCCTG   3660
TGCTTCCCCG CACCCCTGGC CCAGCCTGAT GGCAGCGGCC TTCTGGCCTT CAGCATGCAG   3720
GACAGCAGCC CCCCAGGCCT GGACCTGGAC AGCAGCCCAG TGCTCCAAGA CTGGGTGACC   3780
GCCACCGACG TCCGTGTAGT GCTCACAAGG CCTAGCACGG CAGGTGACCC CAGGGACATG   3840
GAGGCCGTCG TCCCTTACTC CTACGCAGCC ACCGACCTCC AGGTGGGCGG GCGCTGCAAG   3900
TGCAATGGAC ATGCCTCACG GTGCCTGCTG GACACACAGG GCCACCTGAT CTGCGACTGT   3960
CGGCATGGCA CCGAGGGCCC TGACTGCGGC CGCTGCAAGC CCTTCTACTG CGACAGGCCA   4020
TGGCAGCGGG CCACTGCCCG GGAATCCCAC GCCTGCCTCG GTGAGGCCTT GGAGGGTGGC   4080
CTGGGGACCT TGGACACAAC CAGCCTGCCC CTGACCCATC CCTCCCTGCA GCTTGCTCCT   4140
GCAACGGCCA TGCCCGCCGC TGCCGCTTCA ACATGGAGCT GTACCGACTG TCCGGCCGCC   4200
GCAGCGGGGG TGTCTGTCTC AACTGCCGGC ACAACACCGC CGGCCGCCAC TGCCACTACT   4260
GCCGGGAGGG CTTCTATCGA GACCCTGGCC GTGCCCTGAG TGACCGTCGG GCTTGCAGGG   4320
GTGAGCCACC ACCGGCCACC TGCAGGCCCT CACCCTCTGA CTTCCCAGAT CCCCAGACAG   4380
GCTTCTGACC AGGCCCTTCC CACCTCTGTC CTCAGCCTGC GACTGTCACC CGGTTGGTGC   4440
TGCTGGCAAG ACCTGCAACC AGACCACAGG CCAGTGTCCC TGCAAGGATG GCGTCACTGG   4500
CCTCACCTGC AACCGCTGCG CGCCTGGCTT CCAGCAAAGC CGCTCCCCAG TGGCGCCCTG   4560
TGTTAGTGAG TGACCCTGCC CCGCCTCAGC CACCAAGCCA AGGCCACCCC AGCTCCCTGC   4620
TGTTGTCCCG TCTATTCCCC GAGCCCTGCA GATCTCTCTG CCCCTCCATC GCAGGCCATT   4680
CTCCCTCCCT CTCTGCAGAG ACCCCTATCC CTGGACCCAC TGAGGACAGC AGCCCTGTGC   4740
AGCCCCAGGG TGAGTGGACA CAGGACAGGG CCCCAGACTG GCATGACTTT GGGGGAGGGG   4800
GCTCTGGGAG GAGAGGGTGG GGAAAGGGAG TCTGTGCCAG CCTCCCACCT TCTACCCAGA   4860
CTGTGACTCG CACTGCAAAC CTGCCCGTGG CAGCTACCGC ATCAGCCTAA AGAAGTTCTG   4920
CAAGAAGGAC TATGGTAGGT GCCCTCAGGC CTCCCGCGGA CCTTCCCACC TTCCTCCTCT   4980
CCCTACCTTC CCTCCTCCGC CAGCTTCCCC TTGGAACGCC TTGACCCTTG CTGGGCCCCA   5040
AGGCCCATCC TCATCCCTCA GGTCCTCCAC GGGCAGCGAC CCCGCCCCTT CAGCCCCCAC   5100
TGCCCTCCTG GTGTCCTCCC CGTGCCTCCC CCTACCGCGG GCAGGCCGCC CCTTCCTGAC   5160
CCCGCCCCCT CTCGCTCTCC CCGCAGCGGT GCAGGTGGCG GTGGGTGCGC GCGGCGAGGC   5220
GCGCGGCGCG TGGACACGCT TCCCGGTGGC GGTGCTCGCC GTGTTCCGGA GCGGAGAGGA   5280
GCGCGCGCGG CGCGGGAGTA GCGCGCTGTG GGTGCCCGCC GGGGATGCGG CCTGCGGCTG   5340
CCCGCGCCTG CTCCCCGGCC GCCGCTACCT CCTGCTGGGG GGCGGGCCTG GAGCCGCGGC   5400
TGGGGGCGCG GGGGGCCGGG GGCCCGGGCT CATCGCCGCC CGCGGAAGCC TCGTGCTACC   5460
CTGGAGGGAC GCGTGGACGC GGCGCCTGCG GAGGCTGCAG CGACGCGAAC GGCGGGGGCG   5520
CTGCAGCGCC GCCTGAGCCC GCCGGCTGGG CAGGGCGGCC GCTGCTCCCA CATCTAGGCG   5580
CACGTTCACC CTGTGCCTTC GCCTGCCAAG GAGTCCTTGC TCGCGTCGCG CGTGTCGCCA   5640
CCTGGGCCGC CGCCCCGTCC CCGCCGGCAG CTCCCTCGGT ACCTCCCGTC TGGCCCTGGG   5700
GGGATGTGAC CGGCGCACGG ACAGCCCGCC CCGCACAGAG GCAGATGATA TGGCACACCC   5760
GGAGGACCCC ATGGTCTCCC GCCCTCTGGC TGTCGGCCCT GTCCCAGGGG CACTGGGATA   5820
CCCGGAAGGC TGTGAATCCT TCGTGATGCC GGGCCCTCTC GGGGATCTCA GATCATCCCC   5880
GGGGCCGCTG TGATGCACCC CCACCTGTGC GGCGACCCGC CAGGAGCGCA CTGACCTCCC   5940
CAAAGACTGT GGCCACCGCA GGCGCCTTGG ACCCCCATGG GGGACAGGGC GTCCCCTGCC   6000
TCCTGCAGCC CCACGAGGGC GGCGGCCTTG GCCCTGCGGC TGGGCGTCCG CGTCCGGGCG   6060
CCCCGCGGCG TCTGCTGCCG GGTCCCGTAA CTTTCTTGGC CGCCTGTGTC CCCGTCTGCC   6120
GGCTCCGTCC GGCCGTCCCT CTCTCTGCCG CGTCTCTGAC CCTCGGCGCC ACAGCTCCTC   6180
AGCTCAGGGC CCGTCCCAGA ACCTCCTTCC AGCCCTTCTC CCCCGACTCG GGAAGGGACG   6240
TCGTGCCCAC GCGGTTCCGG ATCCACGCGT GACCCGGCCG GACCGCGACT CCGACAGGCG   6300
GCTGTCCGGG CCCCCGATGC CCTCGGCAGG GCCGTGCCAC CCCCCGCCCC TTGTTGTCCC   6360
CCCGGGACCG GCACTGCCGT TTGCCTCCTC TCCGCACGGG ACCGGTTCCC GGCCGGCCCC   6420
AGCTTCCGCC GCTGCGGCCG CCGACCGTCA GCGCGCATGC CCAGAGCCGG GCAGGCCGGA   6480
GCCCCGCCGG CTCTCCGGGG TGGGCACAGG GCGACAGCTC GGCGGGGGCG GGGCCGAGCA   6540
CGCGCGTGCG CAGAAAGGCC GGCGCGGCAG GCTGAGGAGA AAGCGGCGCG CGGAGGTGGG   6600
TGCGCTCGGG GCGTGCGGGG GGCGCGCGGC GGGGTGGCGG GTGGCGGGGC CGGGTCCCCG   6660
CTGTCACCGC GGTCGGCGCG TGCTGGGGGC GGGAGCGTGG GGGCCGGGCT GCGTGCCCCA   6720
TTCGAGGCGG GGATCCCCGG CCACGCGCGG GTTGGGGGCT CCAGAGCCCG GCACCGCCCG   6780
#              6803TTGG CCT

(2) INFORMATION FOR SEQ ID NO:20:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1743 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 1..1740

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

ATG CCT GGC TGG CCC TGG GGG CTG CTG CTG ACG GCA GGC ACG CTC TTC     48
Met Pro Gly Trp Pro Trp Gly Leu Leu Leu Thr Ala Gly Thr Leu Phe
                                                        15

GCC GCC CTG AGT CCT GGG CCG CCG GCG CCC GCC GAC CCC TGC CAC GAT     96
Ala Ala Leu Ser Pro Gly Pro Pro Ala Pro Ala Asp Pro Cys His Asp
                                                    30

GAG GGG GGT GCG CCC CGC GGC TGC GTG CCA GGA CTG GTG AAC GCC GCC    144
Glu Gly Gly Ala Pro Arg Gly Cys Val Pro Gly Leu Val Asn Ala Ala
                                                45

CTG GGC CGC GAG GTG CTG GCT TCC AGC ACG TGC GGG CGG CCG GCC ACT    192
Leu Gly Arg Glu Val Leu Ala Ser Ser Thr Cys Gly Arg Pro Ala Thr
                                            60

CGG GCC TGC GAC GCC TCC GAC CCG CGA CGG GCA CAC TCC CCC GCC CTC    240
Arg Ala Cys Asp Ala Ser Asp Pro Arg Arg Ala His Ser Pro Ala Leu
                                                            80

CTT ACT TCC CCA GGG GGC ACG GCC AGC CCT CTG TGC TGG CGC TCG GAG    288
Leu Thr Ser Pro Gly Gly Thr Ala Ser Pro Leu Cys Trp Arg Ser Glu
                                                        95

TCC CTG CCT CGG GCG CCC CTC AAC GTG ACT CTC ACG GTG CCC CTG GGC    336
Ser Leu Pro Arg Ala Pro Leu Asn Val Thr Leu Thr Val Pro Leu Gly
                                                    110
AAG GCT TTT GAG CTG GTC TTC GTG AGC CTG CGC TTC TGC TCA GCT CCC    384
Lys Ala Phe Glu Leu Val Phe Val Ser Leu Arg Phe Cys Ser Ala Pro
                                                125

CCA GCC TCC GTG GCC CTG CTC AAG TCT CAG GAC CAT GGC CGC AGC TGG    432
Pro Ala Ser Val Ala Leu Leu Lys Ser Gln Asp His Gly Arg Ser Trp
                                            140

GCC CCG CTG GGC TTC TTC TCC TCC CAC TGT GAC CTG GAC TAT GGC CGT    480
Ala Pro Leu Gly Phe Phe Ser Ser His Cys Asp Leu Asp Tyr Gly Arg
145                 150                 155                 160

CTG CCT GCC CCT GCC AAT GGC CCA GCT GGC CCA GGG CCT GAG GCC CTG    528
Leu Pro Ala Pro Ala Asn Gly Pro Ala Gly Pro Gly Pro Glu Ala Leu
                                                        175

TGC TTC CCC GCA CCC CTG GCC CAG CCT GAT GGC AGC GGC CTT CTG GCC    576
Cys Phe Pro Ala Pro Leu Ala Gln Pro Asp Gly Ser Gly Leu Leu Ala
                                                    190

TTC AGC ATG CAG GAC AGC AGC CCC CCA GGC CTG GAC CTG GAC AGC AGC    624
Phe Ser Met Gln Asp Ser Ser Pro Pro Gly Leu Asp Leu Asp Ser Ser
                                                205

CCA GTG CTC CAA GAC TGG GTG ACC GCC ACC GAC GTC CGT GTA GTG CTC    672
Pro Val Leu Gln Asp Trp Val Thr Ala Thr Asp Val Arg Val Val Leu
                                            220

ACA AGG CCT AGC ACG GCA GGT GAC CCC AGG GAC ATG GAG GCC GTC GTC    720
Thr Arg Pro Ser Thr Ala Gly Asp Pro Arg Asp Met Glu Ala Val Val
225                 230                 235                 240

CCT TAC TCC TAC GCA GCC ACC GAC CTC CAG GTG GGC GGG CGC TGC AAG    768
Pro Tyr Ser Tyr Ala Ala Thr Asp Leu Gln Val Gly Gly Arg Cys Lys
                                                        255

TGC AAT GGA CAT GCC TCA CGG TGC CTG CTG GAC ACA CAG GGC CAC CTG    816
Cys Asn Gly His Ala Ser Arg Cys Leu Leu Asp Thr Gln Gly His Leu
                                                    270

ATC TGC GAC TGT CGG CAT GGC ACC GAG GGC CCT GAC TGC GGC CGC TGC    864
Ile Cys Asp Cys Arg His Gly Thr Glu Gly Pro Asp Cys Gly Arg Cys
                                                285

AAG CCC TTC TAC TGC GAC AGG CCA TGG CAG CGG GCC ACT GCC CGG GAA    912
Lys Pro Phe Tyr Cys Asp Arg Pro Trp Gln Arg Ala Thr Ala Arg Glu
                                            300

TCC CAC GCC TGC CTC GCT TGC TCC TGC AAC GGC CAT GCC CGC CGC TGC    960
Ser His Ala Cys Leu Ala Cys Ser Cys Asn Gly His Ala Arg Arg Cys
305                 310                 315                 320

CGC TTC AAC ATG GAG CTG TAC CGA CTG TCC GGC CGC CGC AGC GGG GGT   1008
Arg Phe Asn Met Glu Leu Tyr Arg Leu Ser Gly Arg Arg Ser Gly Gly
                                                        335

GTC TGT CTC AAC TGC CGG CAC AAC ACC GCC GGC CGC CAC TGC CAC TAC   1056
Val Cys Leu Asn Cys Arg His Asn Thr Ala Gly Arg His Cys His Tyr
                                                    350

TGC CGG GAG GGC TTC TAT CGA GAC CCT GGC CGT GCC CTG AGT GAC CGT   1104
Cys Arg Glu Gly Phe Tyr Arg Asp Pro Gly Arg Ala Leu Ser Asp Arg
                                                365

CGG GCT TGC AGG GCC TGC GAC TGT CAC CCG GTT GGT GCT GCT GGC AAG   1152
Arg Ala Cys Arg Ala Cys Asp Cys His Pro Val Gly Ala Ala Gly Lys
                                            380

ACC TGC AAC CAG ACC ACA GGC CAG TGT CCC TGC AAG GAT GGC GTC ACT   1200
Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr
385                 390                 395                 400

GGC CTC ACC TGC AAC CGC TGC GCG CCT GGC TTC CAG CAA AGC CGC TCC   1248
Gly Leu Thr Cys Asn Arg Cys Ala Pro Gly Phe Gln Gln Ser Arg Ser
                                                        415

CCA GTG GCG CCC TGT GTT AAG ACC CCT ATC CCT GGA CCC ACT GAG GAC   1296
Pro Val Ala Pro Cys Val Lys Thr Pro Ile Pro Gly Pro Thr Glu Asp
                                                    430

AGC AGC CCT GTG CAG CCC CAG GAC TGT GAC TCG CAC TGC AAA CCT GCC   1344
Ser Ser Pro Val Gln Pro Gln Asp Cys Asp Ser His Cys Lys Pro Ala
                                                445

CGT GGC AGC TAC CGC ATC AGC CTA AAG AAG TTC TGC AAG AAG GAC TAT   1392
Arg Gly Ser Tyr Arg Ile Ser Leu Lys Lys Phe Cys Lys Lys Asp Tyr
                                            460

GCG GTG CAG GTG GCG GTG GGT GCG CGC GGC GAG GCG CGC GGC GCG TGG   1440
Ala Val Gln Val Ala Val Gly Ala Arg Gly Glu Ala Arg Gly Ala Trp
465                 470                 475                 480

ACA CGC TTC CCG GTG GCG GTG CTC GCC GTG TTC CGG AGC GGA GAG GAG   1488
Thr Arg Phe Pro Val Ala Val Leu Ala Val Phe Arg Ser Gly Glu Glu
                                                        495

CGC GCG CGG CGC GGG AGT AGC GCG CTG TGG GTG CCC GCC GGG GAT GCG   1536
                                                                    Arg Ala Arg Arg Gly Ser Ser Ala Leu Trp Va - #l Pro Ala Gly Asp Ala           #           510                                                               - GCC TGC GGC TGC CCG CGC CTG CTC CCC GGC CG - #C CGC TAC CTC CTG CTG         1584                                                                          Ala Cys Gly Cys Pro Arg Leu Leu Pro Gly Ar - #g Arg Tyr Leu Leu Leu           #       525                                                                   - GGG GGC GGG CCT GGA GCC GCG GCT GGG GGC GC - #G GGG GGC CGG GGG CCC         1632                                                                          Gly Gly Gly Pro Gly Ala Ala Ala Gly Gly Al - #a Gly Gly Arg Gly Pro           #   540                                                                       - GGG CTC ATC GCC GCC CGC GGA AGC CTC GTG CT - #A CCC TGG AGG GAC GCG         1680                                                                          Gly Leu Ile Ala Ala Arg Gly Ser Leu Val Le - #u Pro Trp Arg Asp Ala           545                 5 - #50                 5 - #55                 5 -       #60                                                                           - TGG ACG CGG CGC CTG CGG AGG CTG CAG CGA CG - #C GAA CGG CGG GGG CGC         1728                                                                          Trp Thr Arg Arg Leu Arg Arg Leu Gln Arg Ar - #g Glu Arg Arg Gly Arg           #               575                                                           #  1743            GA                                                         Cys Ser Ala Ala                                                                           580                                                               - (2) INFORMATION FOR SEQ ID NO:21:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #acids    (A) LENGTH: 580 amino                             
          (B) TYPE: amino acid
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

Met Pro Gly Trp Pro Trp Gly Leu Leu Leu Thr Ala Gly Thr Leu Phe
                                                        15
Ala Ala Leu Ser Pro Gly Pro Pro Ala Pro Ala Asp Pro Cys His Asp
                                                    30
Glu Gly Gly Ala Pro Arg Gly Cys Val Pro Gly Leu Val Asn Ala Ala
                                                45
Leu Gly Arg Glu Val Leu Ala Ser Ser Thr Cys Gly Arg Pro Ala Thr
                                            60
Arg Ala Cys Asp Ala Ser Asp Pro Arg Arg Ala His Ser Pro Ala Leu
                                                            80
Leu Thr Ser Pro Gly Gly Thr Ala Ser Pro Leu Cys Trp Arg Ser Glu
                                                        95
Ser Leu Pro Arg Ala Pro Leu Asn Val Thr Leu Thr Val Pro Leu Gly
                                                    110
Lys Ala Phe Glu Leu Val Phe Val Ser Leu Arg Phe Cys Ser Ala Pro
                                                125
Pro Ala Ser Val Ala Leu Leu Lys Ser Gln Asp His Gly Arg Ser Trp
                                            140
Ala Pro Leu Gly Phe Phe Ser Ser His Cys Asp Leu Asp Tyr Gly Arg
145                 150                 155                 160
Leu Pro Ala Pro Ala Asn Gly Pro Ala Gly Pro Gly Pro Glu Ala Leu
                                                        175
Cys Phe Pro Ala Pro Leu Ala Gln Pro Asp Gly Ser Gly Leu Leu Ala
                                                    190
Phe Ser Met Gln Asp Ser Ser Pro Pro Gly Leu Asp Leu Asp Ser Ser
                                                205
Pro Val Leu Gln Asp Trp Val Thr Ala Thr Asp Val Arg Val Val Leu
                                            220
Thr Arg Pro Ser Thr Ala Gly Asp Pro Arg Asp Met Glu Ala Val Val
225                 230                 235                 240
Pro Tyr Ser Tyr Ala Ala Thr Asp Leu Gln Val Gly Gly Arg Cys Lys
                                                        255
Cys Asn Gly His Ala Ser Arg Cys Leu Leu Asp Thr Gln Gly His Leu
                                                    270
Ile Cys Asp Cys Arg His Gly Thr Glu Gly Pro Asp Cys Gly Arg Cys
                                                285
Lys Pro Phe Tyr Cys Asp Arg Pro Trp Gln Arg Ala Thr Ala Arg Glu
                                            300
Ser His Ala Cys Leu Ala Cys Ser Cys Asn Gly His Ala Arg Arg Cys
305                 310                 315                 320
Arg Phe Asn Met Glu Leu Tyr Arg Leu Ser Gly Arg Arg Ser Gly Gly
                                                        335
Val Cys Leu Asn Cys Arg His Asn Thr Ala Gly Arg His Cys His Tyr
                                                    350
Cys Arg Glu Gly Phe Tyr Arg Asp Pro Gly Arg Ala Leu Ser Asp Arg
                                                365
Arg Ala Cys Arg Ala Cys Asp Cys His Pro Val Gly Ala Ala Gly Lys
                                            380
Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr
385                 390                 395                 400
Gly Leu Thr Cys Asn Arg Cys Ala Pro Gly Phe Gln Gln Ser Arg Ser
                                                        415
Pro Val Ala Pro Cys Val Lys Thr Pro Ile Pro Gly Pro Thr Glu Asp
                                                    430
Ser Ser Pro Val Gln Pro Gln Asp Cys Asp Ser His Cys Lys Pro Ala
                                                445
Arg Gly Ser Tyr Arg Ile Ser Leu Lys Lys Phe Cys Lys Lys Asp Tyr
                                            460
Ala Val Gln Val Ala Val Gly Ala Arg Gly Glu Ala Arg Gly Ala Trp
465                 470                 475                 480
Thr Arg Phe Pro Val Ala Val Leu Ala Val Phe Arg Ser Gly Glu Glu
                                                        495
Arg Ala Arg Arg Gly Ser Ser Ala Leu Trp Val Pro Ala Gly Asp Ala
                                                    510
Ala Cys Gly Cys Pro Arg Leu Leu Pro Gly Arg Arg Tyr Leu Leu Leu
                                                525
Gly Gly Gly Pro Gly Ala Ala Ala Gly Gly Ala Gly Gly Arg Gly Pro
                                            540
                          - Gly Leu Ile Ala Ala Arg Gly Ser Leu Val Le - #u Pro Trp Arg Asp Ala         545                 5 - #50                 5 - #55                 5 -       #60                                                                           - Trp Thr Arg Arg Leu Arg Arg Leu Gln Arg Ar - #g Glu Arg Arg Gly Arg         #               575                                                           - Cys Ser Ala Ala                                                                         580                                                               - (2) INFORMATION FOR SEQ ID NO:22:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #acids    (A) LENGTH: 606 amino                                                         (B) TYPE: amino acid                                                          (C) STRANDEDNESS: Not R - #elevant                                            (D) TOPOLOGY: unknown                                               -     (ii) MOLECULE TYPE: protein                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:                                - Met Pro Arg Arg Gly Ala Glu Gly Pro Leu Al - #a Leu Leu Leu Ala Ala         #                15                                                           - Ala Trp Leu Ala Gln Pro Leu Arg Gly Gly Ty - #r Pro Gly Leu Asn Met         #            30                                                               - Phe Ala Val Gln Thr Ala Gln Pro Asp Pro Cy - #s Tyr Asp Glu His Gly         #        45                                                                   - Leu Pro Arg Arg Cys Ile Pro Asp Phe Val As - #n Ser Ala Phe Gly Lys         #    60                                                                       - Glu Val Lys Val Ser Ser Thr Cys Gly Lys Pr - #o Pro Ser Arg Tyr Cys         #80                                                                           - Val Val Thr Glu Lys 
Gly Glu Glu Gln Val Ar - #g Ser Cys His Leu Cys         #                95                                                           - Asn Ala Ser Asp Pro Lys Arg Ala His Pro Pr - #o Ser Phe Leu Thr Asp         #           110                                                               - Leu Asn Asn Pro His Asn Leu Thr Cys Trp Gl - #n Ser Asp Ser Tyr Val         #       125                                                                   - Gln Tyr Pro His Asn Val Thr Leu Thr Leu Se - #r Leu Gly Lys Lys Phe         #   140                                                                       - Glu Val Thr Tyr Val Ser Leu Gln Phe Cys Se - #r Pro Arg Pro Glu Ser         145                 1 - #50                 1 - #55                 1 -       #60                                                                           - Met Ala Ile Tyr Lys Ser Met Asp Tyr Gly Ly - #s Thr Trp Val Pro Phe         #               175                                                           - Gln Phe Tyr Ser Thr Gln Cys Arg Lys Met Ty - #r Asn Lys Pro Ser Arg         #           190                                                               - Ala Ala Ile Thr Lys Gln Asn Glu Gln Glu Al - #a Ile Cys Thr Asp Ser         #       205                                                                   - His Thr Asp Val Arg Pro Leu Ser Gly Gly Le - #u Ile Ala Phe Ser Thr         #   220                                                                       - Leu Asp Gly Arg Pro Thr Ala His Asp Phe As - #p Asn Ser Pro Val Leu         225                 2 - #30                 2 - #35                 2 -       #40                                                                           - Gln Asp Trp Val Thr Ala Thr Asp Ile Lys Va - #l Thr Phe Ser Arg Leu         #               255                                                           - His Thr Phe Gly Asp Glu Asn Glu Asp Asp Se - #r Glu Leu Ala Arg Asp         #           270                                                         
Ser Tyr Phe Tyr Ala Val Ser Asp Leu Gln Val Gly Gly Arg Cys Lys
                                                285
Cys Asn Gly His Ala Ser Arg Cys Val Arg Asp Arg Asp Asp Asn Leu
                                            300
Val Cys Asp Cys Lys His Asn Thr Ala Gly Pro Glu Cys Asp Arg Cys
305                 310                 315                 320
Lys Pro Phe His Tyr Asp Arg Pro Trp Gln Arg Ala Thr Ala Arg Glu
                                                        335
Ala Asn Glu Cys Val Ala Cys Asn Cys Asn Leu His Ala Arg Arg Cys
                                                    350
Arg Phe Asn Met Glu Leu Tyr Lys Leu Ser Gly Arg Lys Ser Gly Gly
                                                365
Val Cys Leu Asn Cys Arg His Asn Thr Ala Gly Arg His Cys His Tyr
                                            380
Cys Lys Glu Gly Phe Tyr Arg Asp Leu Ser Lys Pro Ile Ser His Arg
385                 390                 395                 400
Lys Ala Cys Lys Glu Cys Asp Cys His Pro Val Gly Ala Ala Gly Gln
                                                        415
Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr
                                                    430
Gly Ile Thr Cys Asn Arg Cys Ala Lys Gly Tyr Gln Gln Ser Arg Ser
                                                445
Pro Ile Ala Pro Cys Ile Lys Ile Pro Ala Ala Pro Pro Pro Thr Ala
                                            460
                                  - Ala Ser Ser Thr Glu Glu Pro Ala Asp Cys As - #p Ser Tyr Cys Lys Ala         465                 4 - #70                 4 - #75                 4 -       #80                                                                           - Ser Lys Gly Lys Leu Lys Ile Asn Met Lys Ly - #s Tyr Cys Lys Lys Asp         #               495                                                           - Tyr Ala Val Gln Ile His Ile Leu Lys Ala Gl - #u Lys Asn Ala Asp Trp         #           510                                                               - Trp Lys Phe Thr Val Asn Ile Ile Ser Val Ty - #r Lys Gln Gly Ser Asn         #       525                                                                   - Arg Leu Arg Arg Gly Asp Gln Thr Leu Trp Va - #l His Ala Lys Asp Ile         #   540                                                                       - Ala Cys Lys Cys Pro Lys Val Lys Pro Met Ly - #s Lys Tyr Leu Leu Leu         545                 5 - #50                 5 - #55                 5 -       #60                                                                           - Gly Ser Thr Glu Asp Ser Pro Asp Gln Ser Gl - #y Ile Ile Ala Asp Lys         #               575                                                           - Ser Ser Leu Val Ile Gln Trp Arg Asp Thr Tr - #p Ala Arg Arg Leu Arg         #           590                                                               - Lys Phe Gln Gln Arg Glu Lys Lys Gly Lys Cy - #s Arg Lys Ala                 #       605                                                                   - (2) INFORMATION FOR SEQ ID NO:23:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #acids    (A) LENGTH: 581 amino                                                         (B) TYPE: amino acid                                                          (C) STRANDEDNESS: Not R - #elevant                                            (D) 
TOPOLOGY: unknown                                               -     (ii) MOLECULE TYPE: protein                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:                                - Leu Arg Leu Leu Leu Thr Thr Ser Val Leu Ar - #g Leu Ala Arg Ala Ala         #                15                                                           - Asn Pro Glu Val Ala Gln Gln Thr Pro Pro As - #p Pro Cys Tyr Asp Glu         #            30                                                               - Ser Gly Ala Pro Arg Arg Cys Ile Pro Glu Ph - #e Val Asn Ala Ala Phe         #        45                                                                   - Gly Lys Glu Val Gln Ala Ser Ser Thr Cys Gl - #y Lys Pro Pro Thr Arg         #    60                                                                       - His Cys Asp Ala Ser Asp Pro Arg Arg Ala Hi - #s Pro Pro Ala Tyr Leu         #80                                                                           - Thr Asp Leu Asn Thr Ala Ala Asn Met Thr Cy - #s Trp Arg Ser Glu Thr         #                95                                                           - Leu His His Leu Pro His Asn Val Thr Leu Th - #r Leu Ser Leu Gly Lys         #           110                                                               - Lys Phe Glu Val Val Tyr Val Ser Leu Gln Ph - #e Cys Ser Pro Arg Pro         #       125                                                                   - Glu Ser Thr Ala Ile Phe Lys Ser Met Asp Ty - #r Gly Lys Thr Trp Val         #   140                                                                       - Pro Tyr Gln Tyr Tyr Ser Ser Gln Cys Arg Ly - #s Ile Tyr Gly Lys Pro         145                 1 - #50                 1 - #55                 1 -       #60                                                                           - Ser Lys Ala Thr Val Thr Lys Gln Asn Glu Gl - #n Glu Ala Leu Cys Thr         #               175                                             
Asp Gly Leu Thr Asp Leu Tyr Pro Leu Thr Gly Gly Leu Ile Ala Phe
                                                    190
Ser Thr Leu Asp Gly Arg Pro Ser Ala Gln Asp Phe Asp Ser Ser Pro
                                                205
Val Leu Gln Asp Trp Val Thr Ala Thr Asp Ile Arg Val Val Phe Ser
                                            220
Arg Pro His Leu Phe Arg Glu Leu Gly Gly Arg Glu Ala Gly Glu Glu
225                 230                 235                 240
Asp Gly Gly Ala Gly Ala Thr Pro Tyr Tyr Tyr Ser Val Gly Glu Leu
                                                        255
Gln Val Gly Gly Arg Cys Lys Cys Asn Gly His Ala Ser Arg Cys Val
                                                    270
Lys Asp Lys Glu Gln Lys Leu Val Cys Asp Cys Lys His Asn Thr Glu
                                                285
Gly Pro Glu Cys Asp Arg Cys Lys Pro Phe His Tyr Asp Arg Pro Trp
                                            300
Gln Arg Ala Ser Ala Arg Glu Ala Asn Glu Cys Leu Ala Cys Asn Cys
305                 310                 315                 320
Asn Leu His Ala Arg Arg Cys Arg Phe Asn Met Glu Leu Tyr Lys Leu
                                                        335
Ser Gly Arg Lys Ser Gly Gly Val Cys Leu Asn Cys Arg His Asn Thr
                                                    350
Ala Gly Arg His Cys His Tyr Cys Lys Glu Gly Phe Tyr Arg Asp Leu
                                                365
                                          - Ser Lys Ser Ile Thr Asp Arg Lys Ala Cys Ly - #s Ala Cys Asp Cys His         #   380                                                                       - Pro Val Gly Ala Ala Gly Lys Thr Cys Asn Gl - #n Thr Thr Gly Gln Cys         385                 3 - #90                 3 - #95                 4 -       #00                                                                           - Pro Cys Lys Asp Gly Val Thr Gly Leu Thr Cy - #s Asn Arg Cys Ala Lys         #               415                                                           - Gly Phe Gln Gln Ser Arg Ser Pro Val Ala Pr - #o Cys Ile Lys Ile Pro         #           430                                                               - Ala Ile Asn Pro Thr Ser Leu Val Thr Ser Th - #r Glu Ala Pro Ala Asp         #       445                                                                   - Cys Asp Ser Tyr Cys Lys Pro Ala Lys Gly As - #n Tyr Lys Ile Asn Met         #   460                                                                       - Lys Lys Tyr Cys Lys Lys Asp Tyr Val Val Gl - #n Val Asn Ile Leu Glu         465                 4 - #70                 4 - #75                 4 -       #80                                                                           - Met Glu Thr Val Ala Asn Trp Ala Lys Phe Th - #r Ile Asn Ile Leu Ser         #               495                                                           - Val Tyr Lys Cys Arg Asp Glu Arg Val Lys Ar - #g Gly Asp Asn Phe Leu         #           510                                                               - Trp Ile His Leu Lys Asp Leu Ser Cys Lys Cy - #s Pro Lys Ile Gln Ile         #       525                                                                   - Ser Lys Lys Tyr Leu Val Met Gly Ile Ser Gl - #u Asn Ser Thr Asp Arg         #   540                                                                       - Pro Gly Leu Met Ala Asp Lys Asn Ser Leu Va - #l Ile Gln Trp Arg Asp         545     
            5 - #50                 5 - #55                 5 -       #60                                                                           - Ala Trp Thr Arg Arg Leu Arg Lys Leu Gln Ar - #g Arg Glu Lys Lys Gly         #               575                                                           - Lys Cys Val Lys Pro                                                                     580                                                               - (2) INFORMATION FOR SEQ ID NO:24:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 5894 base                                                         (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: cDNA                                                -     (ix) FEATURE:                                                                     (A) NAME/KEY: CDS                                                             (B) LOCATION: 2..5053                                               -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:                                #CTG CCA TTG CTG TTT         46 GAA CTC TTC                                     Lys Val Leu Val Thr Val Leu Glu Leu P - #he Leu Pro Leu Leu Phe             # 15                                                                          - TCT GGG ATC CTC ATC TGG CTC CGC TTG AAG AT - #T CAG TCG GAA AAT GTG           94                                                                          Ser Gly Ile Leu Ile Trp Leu Arg Leu Lys Il - #e Gln Ser Glu Asn Val           #                 30                                                          - CCC AAC GCC ACC ATC TAC CCG GGC CAG TCC AT - #C CAG GAG CTG CCT CTG          142                                                      
                    Pro Asn Ala Thr Ile Tyr Pro Gly Gln Ser Il - #e Gln Glu Leu Pro Leu           #             45                                                              - TTC TTC ACC TTC CCT CCG CCA GGA GAC ACC TG - #G GAG CTT GCC TAC ATC          190                                                                          Phe Phe Thr Phe Pro Pro Pro Gly Asp Thr Tr - #p Glu Leu Ala Tyr Ile           #         60                                                                  - CCT TCT CAC AGT GAC GCT GCC AAG GCC GTC AC - #T GAG ACA GTG CGC AGG          238                                                                          Pro Ser His Ser Asp Ala Ala Lys Ala Val Th - #r Glu Thr Val Arg Arg           #     75                                                                      - GCA CTT GTG ATC AAC ATG CGA GTG CGC GGC TT - #T CCC TCC GAG AAG GAC          286                                                                          Ala Leu Val Ile Asn Met Arg Val Arg Gly Ph - #e Pro Ser Glu Lys Asp           # 95                                                                          - TTT GAG GAC TAC ATT AGG TAC GAC AAC TGC TC - #G TCC AGC GTG CTG GCC          334                                                                          Phe Glu Asp Tyr Ile Arg Tyr Asp Asn Cys Se - #r Ser Ser Val Leu Ala           #               110                                                           - GCC GTG GTC TTC GAG CAC CCC TTC AAC CAC AG - #C AAG GAG CCC CTG CCG          382                                                                          Ala Val Val Phe Glu His Pro Phe Asn His Se - #r Lys Glu Pro Leu Pro           #           125                                                               - CTG GCG GTG AAA TAT CAC CTA CGG TTC AGT TA - #C ACA CGG AGA AAT TAC          430                                                                          Leu Ala Val Lys Tyr His Leu Arg Phe Ser Ty - #r Thr Arg Arg Asn Tyr           #       140                   
                                                - ATG TGG ACC CAA ACA GGC TCC TTT TTC CTG AA - #A GAG ACA GAA GGC TGG          478                                                                          Met Trp Thr Gln Thr Gly Ser Phe Phe Leu Ly - #s Glu Thr Glu Gly Trp           #   155                                                                       - CAC ACT ACT TCC CTT TTC CCG CTT TTC CCA AA - #C CCA GGA CCA AGG GAA          526                                                                          His Thr Thr Ser Leu Phe Pro Leu Phe Pro As - #n Pro Gly Pro Arg Glu           160                 1 - #65                 1 - #70                 1 -       #75                                                                           - CTA ACA TCC CCT GAT GGC GGA GAA CCT GGG TA - #C ATC CGG GAA GGC TTC          574                                                                          Leu Thr Ser Pro Asp Gly Gly Glu Pro Gly Ty - #r Ile Arg Glu Gly Phe           #               190                                                           - CTG GCC GTG CAG CAT GCT GTG GAC CGG GCC AT - #C ATG GAG TAC CAT GCC          622                                                                          Leu Ala Val Gln His Ala Val Asp Arg Ala Il - #e Met Glu Tyr His Ala           #           205                                                               - GAT GCC GCC ACA CGC CAG CTG TTC CAG AGA CT - #G ACG GTG ACC ATC AAG          670                                                                          Asp Ala Ala Thr Arg Gln Leu Phe Gln Arg Le - #u Thr Val Thr Ile Lys           #       220                                                                   - AGG TTC CCG TAC CCG CCG TTC ATC GCA GAC CC - #C TTC CTC GTG GCC ATC          718                                                                          Arg Phe Pro Tyr Pro Pro Phe Ile Ala Asp Pr - #o Phe Leu Val Ala Ile           #   235                                                                       - 
CAG TAC CAG CTG CCC CTG CTG CTG CTG CTC AG - #C TTC ACC TAC ACC GCG          766                                                                          Gln Tyr Gln Leu Pro Leu Leu Leu Leu Leu Se - #r Phe Thr Tyr Thr Ala           240                 2 - #45                 2 - #50                 2 -       #55                                                                           - CTC ACC ATT GCC CGT GCT GTC GTG CAG GAG AA - #G GAA AGG AGG CTG AAG          814                                                                          Leu Thr Ile Ala Arg Ala Val Val Gln Glu Ly - #s Glu Arg Arg Leu Lys           #               270                                                           - GAG TAC ATG CGC ATG ATG GGG CTC AGC AGC TG - #G CTG CAC TGG AGT GCC          862                                                                          Glu Tyr Met Arg Met Met Gly Leu Ser Ser Tr - #p Leu His Trp Ser Ala           #           285                                                               - TGG TTC CTC TTG TTC TTC CTC TTC CTC CTC AT - #C GCC GCC TCC TTC ATG          910                                                                          Trp Phe Leu Leu Phe Phe Leu Phe Leu Leu Il - #e Ala Ala Ser Phe Met           #       300                                                                   - ACC CTG CTC TTC TGT GTC AAG GTG AAG CCA AA - #T GTA GCC GTG CTG TCC          958                                                                          Thr Leu Leu Phe Cys Val Lys Val Lys Pro As - #n Val Ala Val Leu Ser           #   315                                                                       - CGC AGC GAC CCC TCC CTG GTG CTC GCC TTC CT - #G CTG TGC TTC GCC ATC         1006                                                                          Arg Ser Asp Pro Ser Leu Val Leu Ala Phe Le - #u Leu Cys Phe Ala Ile           320                 3 - #25                 3 - #30                 3 -       #35                                                 
                          - TCT ACC ATC TCC TTC AGC TTC ATG GTC AGC AC - #C TTC TTC AGC AAA GCC         1054                                                                          Ser Thr Ile Ser Phe Ser Phe Met Val Ser Th - #r Phe Phe Ser Lys Ala           #               350                                                           - AAC ATG GCA GCA GCC TTC GGA GGC TTC CTC TA - #C TTC TTC ACC TAC ATC         1102                                                                          Asn Met Ala Ala Ala Phe Gly Gly Phe Leu Ty - #r Phe Phe Thr Tyr Ile           #           365                                                               - CCC TAC TTC TTC GTG GCC CCT CGG TAC AAC TG - #G ATG ACT CTG AGC CAG         1150                                                                          Pro Tyr Phe Phe Val Ala Pro Arg Tyr Asn Tr - #p Met Thr Leu Ser Gln           #       380                                                                   - AAG CTC TGC TCC TGC CTC CTG TCT AAT GTC GC - #C ATG GCA ATG GGA GCC         1198                                                                          Lys Leu Cys Ser Cys Leu Leu Ser Asn Val Al - #a Met Ala Met Gly Ala           #   395                                                                       - CAG CTC ATT GGG AAA TTT GAG GCG AAA GGC AT - #G GGC ATC CAG TGG CGA         1246                                                                          Gln Leu Ile Gly Lys Phe Glu Ala Lys Gly Me - #t Gly Ile Gln Trp Arg           400                 4 - #05                 4 - #10                 4 -       #15                                                                           - GAC CTC CTG AGT CCC GTC AAC GTG GAC GAC GA - #C TTC TGC TTC GGG CAG         1294                                                                          Asp Leu Leu Ser Pro Val Asn Val Asp Asp As - #p Phe Cys Phe Gly Gln           #               430                                                           - GTG CTG GGG ATG CTG 
CTG CTG GAC TCT GTG CT - #C TAT GGC CTG GTG ACC         1342                                                                          Val Leu Gly Met Leu Leu Leu Asp Ser Val Le - #u Tyr Gly Leu Val Thr           #           445                                                               - TGG TAC ATG GAG GCC GTC TTC CCA GGG CAG TT - #C GGC GTG CCT CAG CCC         1390                                                                          Trp Tyr Met Glu Ala Val Phe Pro Gly Gln Ph - #e Gly Val Pro Gln Pro           #       460                                                                   - TGG TAC TTC TTC ATC ATG CCC TCC TAT TGG TG - #T GGG AAG CCA AGG GCG         1438                                                                          Trp Tyr Phe Phe Ile Met Pro Ser Tyr Trp Cy - #s Gly Lys Pro Arg Ala           #   475                                                                       - GTT GCA GGG AAG GAG GAA GAA GAC AGT GAC CC - #C GAG AAA GCA CTC AGA         1486                                                                          Val Ala Gly Lys Glu Glu Glu Asp Ser Asp Pr - #o Glu Lys Ala Leu Arg           480                 4 - #85                 4 - #90                 4 -       #95                                                                           - AAC GAG TAC TTT GAA GCC GAG CCA GAG GAC CT - #G GTG GCG GGG ATC AAG         1534                                                                          Asn Glu Tyr Phe Glu Ala Glu Pro Glu Asp Le - #u Val Ala Gly Ile Lys           #               510                                                           - ATC AAG CAC CTG TCC AAG GTG TTC AGG GTG GG - #A AAT AAG GAC AGG GCG         1582                                                                          Ile Lys His Leu Ser Lys Val Phe Arg Val Gl - #y Asn Lys Asp Arg Ala           #           525                                                               - GCC GTC AGA GAC CTG AAC CTC AAC CTG TAC GA - #G GGA CAG ATC ACC GTC   
      1630                                                                          Ala Val Arg Asp Leu Asn Leu Asn Leu Tyr Gl - #u Gly Gln Ile Thr Val           #       540                                                                   - CTG CTG GGC CAC AAC GGT GCC GGG AAG ACC AC - #C ACC CTC TCC ATG CTC         1678                                                                          Leu Leu Gly His Asn Gly Ala Gly Lys Thr Th - #r Thr Leu Ser Met Leu           #   555                                                                       - ACA GGT CTC TTT CCC CCC ACC AGT GGA CGG GC - #A TAC ATC AGC GGG TAT         1726                                                                          Thr Gly Leu Phe Pro Pro Thr Ser Gly Arg Al - #a Tyr Ile Ser Gly Tyr           560                 5 - #65                 5 - #70                 5 -       #75                                                                           - GAA ATT TCC CAG GAC ATG GTT CAG ATC CGG AA - #G AGC CTG GGC CTG TGC         1774                                                                          Glu Ile Ser Gln Asp Met Val Gln Ile Arg Ly - #s Ser Leu Gly Leu Cys           #               590                                                           - CCG CAG CAC GAC ATC CTG TTT GAC AAC TTG AC - #A GTC GCA GAG CAC CTT         1822                                                                          Pro Gln His Asp Ile Leu Phe Asp Asn Leu Th - #r Val Ala Glu His Leu           #           605                                                               - TAT TTC TAC GCC CAG CTG AAG GGC CTG TCA CG - #T CAG AAG TGC CCT GAA         1870                                                                          Tyr Phe Tyr Ala Gln Leu Lys Gly Leu Ser Ar - #g Gln Lys Cys Pro Glu           #       620                                                                   - GAA GTC AAG CAG ATG CTG CAC ATC ATC GGC CT - #G GAG GAC AAG TGG AAC         1918                                        
                                  Glu Val Lys Gln Met Leu His Ile Ile Gly Le - #u Glu Asp Lys Trp Asn           #   635                                                                       - TCA CGG AGC CGC TTC CTG AGC GGG GGC ATG AG - #G CGC AAG CTC TCC ATC         1966                                                                          Ser Arg Ser Arg Phe Leu Ser Gly Gly Met Ar - #g Arg Lys Leu Ser Ile           640                 6 - #45                 6 - #50                 6 -       #55                                                                           - GGC ATC GCC CTC ATC GCA GGC TCC AAG GTG CT - #G ATA CTG GAC GAG CCC         2014                                                                          Gly Ile Ala Leu Ile Ala Gly Ser Lys Val Le - #u Ile Leu Asp Glu Pro           #               670                                                           - ACC TCG GGC ATG GAC GCC ATC TCC AGG AGG GC - #C ATC TGG GAT CTT CTT         2062                                                                          Thr Ser Gly Met Asp Ala Ile Ser Arg Arg Al - #a Ile Trp Asp Leu Leu           #           685                                                               - CAG CGG CAG AAA AGT GAC CGC ACC ATC GTG CT - #G ACC ACC CAC TTC ATG         2110                                                                          Gln Arg Gln Lys Ser Asp Arg Thr Ile Val Le - #u Thr Thr His Phe Met           #       700                                                                   - GAC GAG GCT GAC CTG CTG GGA GAC CGC ATC GC - #C ATC ATG GCC AAG GGG         2158                                                                          Asp Glu Ala Asp Leu Leu Gly Asp Arg Ile Al - #a Ile Met Ala Lys Gly           #   715                                                                       - GAG CTG CAG TGC TGC GGG TCC TCG CTG TTC CT - #C AAG CAG AAA TAC GGT         2206                                                                          Glu Leu Gln Cys 
Cys Gly Ser Ser Leu Phe Le - #u Lys Gln Lys Tyr Gly           720                 7 - #25                 7 - #30                 7 -       #35                                                                           - GCC GGC TAT CAC ATG ACG CTG GTG AAG GAG CC - #G CAC TGC AAC CCG GAA         2254                                                                          Ala Gly Tyr His Met Thr Leu Val Lys Glu Pr - #o His Cys Asn Pro Glu           #               750                                                           - GAC ATC TCC CAG CTG GTC CAC CAC CAC GTG CC - #C AAC GCC ACG CTG GAG         2302                                                                          Asp Ile Ser Gln Leu Val His His His Val Pr - #o Asn Ala Thr Leu Glu           #           765                                                               - AGC AGC GCT GGG GCC GAG CTG TCT TTC ATC CT - #T CCC AGA GAG AGC ACG         2350                                                                          Ser Ser Ala Gly Ala Glu Leu Ser Phe Ile Le - #u Pro Arg Glu Ser Thr           #       780                                                                   - CAC AGG TTT GAA GGT CTC TTT GCT AAA CTG GA - #G AAG AAG CAG AAA GAG         2398                                                                          His Arg Phe Glu Gly Leu Phe Ala Lys Leu Gl - #u Lys Lys Gln Lys Glu           #   795                                                                       - CTG GGC ATT GCC AGC TTT GGG GCA TCC ATC AC - #C ACC ATG GAG GAA GTC         2446                                                                          Leu Gly Ile Ala Ser Phe Gly Ala Ser Ile Th - #r Thr Met Glu Glu Val           800                 8 - #05                 8 - #10                 8 -       #15                                                                           - TTC CTT CGG GTC GGG AAG CTG GTG GAC AGC AG - #T ATG GAC ATC CAG GCC         2494                                                              
            Phe Leu Arg Val Gly Lys Leu Val Asp Ser Se - #r Met Asp Ile Gln Ala           #               830                                                           - ATC CAG CTC CCT GCC CTG CAG TAC CAG CAC GA - #G AGG CGC GCC AGC GAC         2542                                                                          Ile Gln Leu Pro Ala Leu Gln Tyr Gln His Gl - #u Arg Arg Ala Ser Asp           #           845                                                               - TGG GCT GTG GAC AGC AAC CTC TGT GGG GCC AT - #G GAC CCC TCC GAC GGC         2590                                                                          Trp Ala Val Asp Ser Asn Leu Cys Gly Ala Me - #t Asp Pro Ser Asp Gly           #       860                                                                   - ATT GGA GCC CTC ATC GAG GAG GAG CGC ACC GC - #T GTC AAG CTC AAC ACT         2638                                                                          Ile Gly Ala Leu Ile Glu Glu Glu Arg Thr Al - #a Val Lys Leu Asn Thr           #   875                                                                       - GGG CTC GCC CTG CAC TGC CAG CAA TTC TGG GC - #C ATG TTC CTG AAG AAG         2686                                                                          Gly Leu Ala Leu His Cys Gln Gln Phe Trp Al - #a Met Phe Leu Lys Lys           880                 8 - #85                 8 - #90                 8 -       #95                                                                           - GCC GCA TAC AGC TGG CGC GAG TGG AAA ATG GT - #G GCG GCA CAG GTC CTG         2734                                                                          Ala Ala Tyr Ser Trp Arg Glu Trp Lys Met Va - #l Ala Ala Gln Val Leu           #               910                                                           - GTG CCT CTG ACC TGC GTC ACC CTG GCC CTC CT - #G GCC ATC AAC TAC TCC         2782                                                                          Val Pro Leu Thr Cys Val Thr Leu Ala 
Leu Le - #u Ala Ile Asn Tyr Ser           #           925                                                               - TCG GAG CTC TTC GAC GAC CCC ATG CTG AGG CT - #G ACC TTG GGC GAG TAC         2830                                                                          Ser Glu Leu Phe Asp Asp Pro Met Leu Arg Le - #u Thr Leu Gly Glu Tyr           #       940                                                                   - GGC AGA ACC GTC GTG CCC TTC TCA GTT CCC GG - #G ACC TCC CAG CTG GGT         2878                                                                          Gly Arg Thr Val Val Pro Phe Ser Val Pro Gl - #y Thr Ser Gln Leu Gly           #   955                                                                       - CAG CAG CTG TCA GAG CAT CTG AAA GAC GCA CT - #G CAG GCT GAG GGA CAG         2926                                                                          Gln Gln Leu Ser Glu His Leu Lys Asp Ala Le - #u Gln Ala Glu Gly Gln           960                 9 - #65                 9 - #70                 9 -       #75                                                                           - GAG CCC CGC GAG GTG CTC GGT GAC CTG GAG GA - #G TTC TTG ATC TTC AGG         2974                                                                          Glu Pro Arg Glu Val Leu Gly Asp Leu Glu Gl - #u Phe Leu Ile Phe Arg           #               990                                                           - GCT TCT GTG GAG GGG GGC GGC TTT AAT GAG CG - #G TGC CTT GTG GCA GCG         3022                                                                          Ala Ser Val Glu Gly Gly Gly Phe Asn Glu Ar - #g Cys Leu Val Ala Ala           #          10050                                                              - TCC TTC AGA GAT GTG GGA GAG CGC ACG GTC GT - #C AAC GCC TTG TTC AAC         3070                                                                          Ser Phe Arg Asp Val Gly Glu Arg Thr Val Va - #l Asn Ala Leu Phe Asn           #      
10205                                                                  - AAC CAG GCG TAC CAC TCT CCA GCC ACT GCC CT - #G GCC GTC GTG GAC AAC         3118                                                                          Asn Gln Ala Tyr His Ser Pro Ala Thr Ala Le - #u Ala Val Val Asp Asn           #  10350                                                                      - CTT CTG TTC AAG CTG CTG TGC GGG CCT CAC GC - #C TCC ATT GTG GTC TCC         3166                                                                          Leu Leu Phe Lys Leu Leu Cys Gly Pro His Al - #a Ser Ile Val Val Ser           #               10551045 - #                1050                              - AAC TTC CCC CAG CCC CGG AGC GCC CTG CAG GC - #T GCC AAG GAC CAG TTT         3214                                                                          Asn Phe Pro Gln Pro Arg Ser Ala Leu Gln Al - #a Ala Lys Asp Gln Phe           #              10705                                                          - AAC GAG GGC CGG AAG GGA TTC GAC ATT GCC CT - #C AAC CTG CTC TTC GCC         3262                                                                          Asn Glu Gly Arg Lys Gly Phe Asp Ile Ala Le - #u Asn Leu Leu Phe Ala           #          10850                                                              - ATG GCA TTC TTG GCC AGC ACG TTC TCC ATC CT - #G GCG GTC AGC GAG AGG         3310                                                                          Met Ala Phe Leu Ala Ser Thr Phe Ser Ile Le - #u Ala Val Ser Glu Arg           #      11005                                                                  - GCC GTG CAG GCC AAG CAT GTG CAG TTT GTG AG - #T GGA GTC CAC GTG GCC         3358                                                                          Ala Val Gln Ala Lys His Val Gln Phe Val Se - #r Gly Val His Val Ala           #  11150                                                                      - AGT TTC TGG CTC TCT GCT CTG CTG TGG GAC CT - #C ATC 
TCC TTC CTC ATC         3406                                                                          Ser Phe Trp Leu Ser Ala Leu Leu Trp Asp Le - #u Ile Ser Phe Leu Ile           #               11351125 - #                1130                              - CCC AGT CTG CTG CTG CTG GTG GTG TTT AAG GC - #C TTC GAC GTG CGT GCC         3454                                                                          Pro Ser Leu Leu Leu Leu Val Val Phe Lys Al - #a Phe Asp Val Arg Ala           #              11505                                                          - TTC ACG CGG GAC GGC CAC ATG GCT GAC ACC CT - #G CTG CTG CTC CTG CTC         3502                                                                          Phe Thr Arg Asp Gly His Met Ala Asp Thr Le - #u Leu Leu Leu Leu Leu           #          11650                                                              - TAC GGC TGG GCC ATC ATC CCC CTC ATG TAC CT - #G ATG AAC TTC TTC TTC         3550                                                                          Tyr Gly Trp Ala Ile Ile Pro Leu Met Tyr Le - #u Met Asn Phe Phe Phe           #      11805                                                                  - TTG GGG GCG GCC ACT GCC TAC ACG AGG CTG AC - #C ATC TTC AAC ATC CTG         3598                                                                          Leu Gly Ala Ala Thr Ala Tyr Thr Arg Leu Th - #r Ile Phe Asn Ile Leu           #  11950                                                                      - TCA GGC ATC GCC ACC TTC CTG ATG GTC ACC AT - #C ATG CGC ATC CCA GCT         3646                                                                          Ser Gly Ile Ala Thr Phe Leu Met Val Thr Il - #e Met Arg Ile Pro Ala           #               12151205 - #                1210                              - GTA AAA CTG GAA GAA CTT TCC AAA ACC CTG GA - #T CAC GTG TTC CTG GTG         3694                                                                          Val Lys Leu Glu Glu Leu 
Ser Lys Thr Leu As - #p His Val Phe Leu Val           #              12305                                                          - CTG CCC AAC CAC TGT CTG GGG ATG GCA GTC AG - #C AGT TTC TAC GAG AAC         3742                                                                          Leu Pro Asn His Cys Leu Gly Met Ala Val Se - #r Ser Phe Tyr Glu Asn           #          12450                                                              - TAC GAG ACG CGG AGG TAC TGC ACC TCC TCC GA - #G GTC GCC GCC CAC TAC         3790                                                                          Tyr Glu Thr Arg Arg Tyr Cys Thr Ser Ser Gl - #u Val Ala Ala His Tyr           #      12605                                                                  - TGC AAG AAA TAT AAC ATC CAG TAC CAG GAG AA - #C TTC TAT GCC TGG AGC         3838                                                                          Cys Lys Lys Tyr Asn Ile Gln Tyr Gln Glu As - #n Phe Tyr Ala Trp Ser           #  12750                                                                      - GCC CCG GGG GTC GGC CGG TTT GTG GCC TCC AT - #G GCC GCC TCA GGG TGC         3886                                                                          Ala Pro Gly Val Gly Arg Phe Val Ala Ser Me - #t Ala Ala Ser Gly Cys           #               12951285 - #                1290                              - GCC TAC CTC ATC CTG CTC TTC CTC ATC GAG AC - #C AAC CTG CTT CAG AGA         3934                                                                          Ala Tyr Leu Ile Leu Leu Phe Leu Ile Glu Th - #r Asn Leu Leu Gln Arg           #              13105                                                          - CTC AGG GGC ATC CTC TGC GCC CTC CGG AGG AG - #G CGG ACA CTG ACA GAA         3982                                                                          Leu Arg Gly Ile Leu Cys Ala Leu Arg Arg Ar - #g Arg Thr Leu Thr Glu           #          13250                                                          
    - TTA TAC ACC CGG ATG CCT GTG CTT CCT GAG GA - #C CAA GAT GTA GCG GAC         4030                                                                          Leu Tyr Thr Arg Met Pro Val Leu Pro Glu As - #p Gln Asp Val Ala Asp           #      13405                                                                  - GAG AGG ACC CGC ATC CTG GCC CCC AGC CCG GA - #C TCC CTG CTC CAC ACA         4078                                                                          Glu Arg Thr Arg Ile Leu Ala Pro Ser Pro As - #p Ser Leu Leu His Thr           #  13550                                                                      - CCT CTG ATT ATC AAG GAG CTC TCC AAG GTG TA - #C GAG CAG CGG GTG CCC         4126                                                                          Pro Leu Ile Ile Lys Glu Leu Ser Lys Val Ty - #r Glu Gln Arg Val Pro           #               13751365 - #                1370                              - CTC CTG GCC GTG GAC AGG CTC TCC CTC GCG GT - #G CAG AAA GGG GAG TGC         4174                                                                          Leu Leu Ala Val Asp Arg Leu Ser Leu Ala Va - #l Gln Lys Gly Glu Cys           #              13905                                                          - TTC GGC CTG CTG GGC TTC AAT GGA GCC GGG AA - #G ACC ACG ACT TTC AAA         4222                                                                          Phe Gly Leu Leu Gly Phe Asn Gly Ala Gly Ly - #s Thr Thr Thr Phe Lys           #          14050                                                              - ATG CTG ACC GGG GAG GAG AGC CTC ACT TCT GG - #G GAT GCC TTT GTC GGG         4270                                                                          Met Leu Thr Gly Glu Glu Ser Leu Thr Ser Gl - #y Asp Ala Phe Val Gly           #      14205                                                                  - GGT CAC AGA ATC AGC TCT GAT GTC GGA AAG GT - #G CGG CAG CGG ATC GGC         4318                                          
                                Gly His Arg Ile Ser Ser Asp Val Gly Lys Va - #l Arg Gln Arg Ile Gly           #  14350                                                                      - TAC TGC CCG CAG TTT GAT GCC TTG CTG GAC CA - #C ATG ACA GGC CGG GAG         4366                                                                          Tyr Cys Pro Gln Phe Asp Ala Leu Leu Asp Hi - #s Met Thr Gly Arg Glu           #               14551445 - #                1450                              - ATG CTG GTC ATG TAC GCT CGG CTC CGG GGC AT - #C CCT GAG CGC CAC ATC         4414                                                                          Met Leu Val Met Tyr Ala Arg Leu Arg Gly Il - #e Pro Glu Arg His Ile           #              14705                                                          - GGG GCC TGC GTG GAG AAC ACT CTG CGG GGC CT - #G CTG CTG GAG CCA CAT         4462                                                                          Gly Ala Cys Val Glu Asn Thr Leu Arg Gly Le - #u Leu Leu Glu Pro His           #          14850                                                              - GCC AAC AAG CTG GTC AGG ACG TAC AGT GGT GG - #T AAC AAG CGG AAG CTG         4510                                                                          Ala Asn Lys Leu Val Arg Thr Tyr Ser Gly Gl - #y Asn Lys Arg Lys Leu           #      15005                                                                  - AGC ACC GGC ATC GCC CTG ATC GGA GAG CCT GC - #T GTC ATC TTC CTG GAC         4558                                                                          Ser Thr Gly Ile Ala Leu Ile Gly Glu Pro Al - #a Val Ile Phe Leu Asp           #  15150                                                                      - GAG CCG TCC ACT GGC ATG GAC CCC GTG GCC CG - #G CGC CTG CTT TGG GAC         4606                                                                          Glu Pro Ser Thr Gly Met Asp Pro Val Ala Ar - #g Arg Leu Leu Trp Asp           #               
15351525 - #                1530                              - ACC GTG GCA CGA GCC CGA GAG TCT GGC AAG GC - #C ATC ATC ATC ACC TCC         4654                                                                          Thr Val Ala Arg Ala Arg Glu Ser Gly Lys Al - #a Ile Ile Ile Thr Ser           #              15505                                                          - CAC AGC ATG GAG GAG TGT GAG GCC CTG TGC AC - #C CGG CTG GCC ATC ATG         4702                                                                          His Ser Met Glu Glu Cys Glu Ala Leu Cys Th - #r Arg Leu Ala Ile Met           #          15650                                                              - GTG CAG GGG CAG TTC AAG TGC CTG GGC AGC CC - #C CAG CAC CTC AAG AGC         4750                                                                          Val Gln Gly Gln Phe Lys Cys Leu Gly Ser Pr - #o Gln His Leu Lys Ser           #      15805                                                                  - AAG TTC GGC AGC GGC TAC TCC CTG CGG GCC AA - #G GTG CAG AGT GAA GGG         4798                                                                          Lys Phe Gly Ser Gly Tyr Ser Leu Arg Ala Ly - #s Val Gln Ser Glu Gly           #  15950                                                                      - CAA CAG GAG GCG CTG GAG GAG TTC AAG GCC TT - #C GTG GAC CTG ACC TTT         4846                                                                          Gln Gln Glu Ala Leu Glu Glu Phe Lys Ala Ph - #e Val Asp Leu Thr Phe           #               16151605 - #                1610                              - CCA GGC AGC GTC CTG GAA GAT GAG CAC CAA GG - #C ATG GTC CAT TAC CAC         4894                                                                          Pro Gly Ser Val Leu Glu Asp Glu His Gln Gl - #y Met Val His Tyr His           #              16305                                                          - CTG CCG GGC CGT GAC CTC AGC TGG GCG AAG GT - #T TTC GGT ATT CTG 
GAG         4942                                                                          Leu Pro Gly Arg Asp Leu Ser Trp Ala Lys Va - #l Phe Gly Ile Leu Glu           #          16450                                                              - AAA GCC AAG GAA AAG TAC GGC GTG GAC GAC TA - #C TCC GTG AGC CAG ATC         4990                                                                          Lys Ala Lys Glu Lys Tyr Gly Val Asp Asp Ty - #r Ser Val Ser Gln Ile           #      16605                                                                  - TCG CTG GAA CAG GTC TTC CTG AGC TTC GCC CA - #C CTG CAG CCG CCC ACC         5038                                                                          Ser Leu Glu Gln Val Phe Leu Ser Phe Ala Hi - #s Leu Gln Pro Pro Thr           #  16750                                                                      - GCA GAG GAG GGG CGA TGAGGGGTGG CGGCTGTCTC GCCATCAGG - #C AGGGACAGGA         5093                                                                          Ala Glu Glu Gly Arg                                                           1680                                                                          - CGGGCAAGCA GGGCCCATCT TACATCCTCT CTCTCCAAGT TTATCTCATC CT - #TTATTTTT       5153                                                                          - AATCACTTTT TTCTATGATG GATATGAAAA ATTCAAGGCA GTATGCACAG AA - #TGGACGAG       5213                                                                          - TGCAGCCCAG CCCTCATGCC CAGGATCAGC ATGCGCATCT CCATGTCTGC AT - #ACTCTGGA       5273                                                                          - GTTCACTTTC CCAGAGCTGG GGCAGGCCGG GCAGTCTGCG GGCAAGCTCC GG - #GGTCTCTG       5333                                                                          - GGTGGAGAGC TGACCCAGGA AGGGCTGCAG CTGAGCTGGG GGTTGAATTT CT - #CCAGGCAC       5393                                                                          - TCCCTGGAGA GAGGACCCAG TGACTTGTCC 
AAGTTTACAC ACGACACTAA TC - #TCCCCTGG       5453
- GGAGGAAGCG GGAAGCCAGC CAGGTTGAAC TGTAGCGAGG CCCCCAGGCC GC - #CAGGAATG       5513
- GACCATGCAG ATCACTGTCA GTGGAGGGAA GCTGCTGACT GTGATTAGGT GC - #TGGGGTCT       5573
- TAGCGTCCAG CGCAGCCCGG GGGCATCCTG GAGGCTCTGC TCCTTAGGGC AT - #GGTAGTCA       5633
- CCGCGAAGCC GGGCACCGTC CCACAGCATC TCCTAGAAGC AGCCGGCACA GG - #AGGGAAGG       5693
- TGGCCAGGCT CGAAGCAGTC TCTGTTTCCA GCACTGCACC CTCAGGAAGT CG - #CCCGCCCC       5753
- AGGACACGCA GGGACCACCC TAAGGGCTGG GTGGCTGTCT CAAGGACACA TT - #GAATACGT       5813
- TGTGACCATC CAGAAAATAA ATGCTGAGGG GACACAAAAA AAAAAAAAAA AA - #AAAAAAAA       5873
#                5894AA A

(2) INFORMATION FOR SEQ ID NO:25:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1684 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

- Lys Val Leu Val Thr Val Leu Glu Leu Phe Le - #u Pro Leu Leu Phe Ser
#                 15
- Gly 
Ile Leu Ile Trp Leu Arg Leu Lys Ile Gl - #n Ser Glu Asn Val Pro         #             30                                                              - Asn Ala Thr Ile Tyr Pro Gly Gln Ser Ile Gl - #n Glu Leu Pro Leu Phe         #         45                                                                  - Phe Thr Phe Pro Pro Pro Gly Asp Thr Trp Gl - #u Leu Ala Tyr Ile Pro         #     60                                                                      - Ser His Ser Asp Ala Ala Lys Ala Val Thr Gl - #u Thr Val Arg Arg Ala         # 80                                                                          - Leu Val Ile Asn Met Arg Val Arg Gly Phe Pr - #o Ser Glu Lys Asp Phe         #                 95                                                          - Glu Asp Tyr Ile Arg Tyr Asp Asn Cys Ser Se - #r Ser Val Leu Ala Ala         #           110                                                               - Val Val Phe Glu His Pro Phe Asn His Ser Ly - #s Glu Pro Leu Pro Leu         #       125                                                                   - Ala Val Lys Tyr His Leu Arg Phe Ser Tyr Th - #r Arg Arg Asn Tyr Met         #   140                                                                       - Trp Thr Gln Thr Gly Ser Phe Phe Leu Lys Gl - #u Thr Glu Gly Trp His         145                 1 - #50                 1 - #55                 1 -       #60                                                                           - Thr Thr Ser Leu Phe Pro Leu Phe Pro Asn Pr - #o Gly Pro Arg Glu Leu         #               175                                                           - Thr Ser Pro Asp Gly Gly Glu Pro Gly Tyr Il - #e Arg Glu Gly Phe Leu         #           190                                                               - Ala Val Gln His Ala Val Asp Arg Ala Ile Me - #t Glu Tyr His Ala Asp         #       205                                                                   - Ala Ala Thr Arg Gln Leu Phe Gln Arg Leu Th - #r Val 
Thr Ile Lys Arg         #   220                                                                       - Phe Pro Tyr Pro Pro Phe Ile Ala Asp Pro Ph - #e Leu Val Ala Ile Gln         225                 2 - #30                 2 - #35                 2 -       #40                                                                           - Tyr Gln Leu Pro Leu Leu Leu Leu Leu Ser Ph - #e Thr Tyr Thr Ala Leu         #               255                                                           - Thr Ile Ala Arg Ala Val Val Gln Glu Lys Gl - #u Arg Arg Leu Lys Glu         #           270                                                               - Tyr Met Arg Met Met Gly Leu Ser Ser Trp Le - #u His Trp Ser Ala Trp         #       285                                                                   - Phe Leu Leu Phe Phe Leu Phe Leu Leu Ile Al - #a Ala Ser Phe Met Thr         #   300                                                                       - Leu Leu Phe Cys Val Lys Val Lys Pro Asn Va - #l Ala Val Leu Ser Arg         305                 3 - #10                 3 - #15                 3 -       #20                                                                           - Ser Asp Pro Ser Leu Val Leu Ala Phe Leu Le - #u Cys Phe Ala Ile Ser         #               335                                                           - Thr Ile Ser Phe Ser Phe Met Val Ser Thr Ph - #e Phe Ser Lys Ala Asn         #           350                                                               - Met Ala Ala Ala Phe Gly Gly Phe Leu Tyr Ph - #e Phe Thr Tyr Ile Pro         #       365                                                                   - Tyr Phe Phe Val Ala Pro Arg Tyr Asn Trp Me - #t Thr Leu Ser Gln Lys         #   380                                                                       - Leu Cys Ser Cys Leu Leu Ser Asn Val Ala Me - #t Ala Met Gly Ala Gln         385                 3 - #90                 3 - #95                 4 -       #00                       
                                                    - Leu Ile Gly Lys Phe Glu Ala Lys Gly Met Gl - #y Ile Gln Trp Arg Asp         #               415                                                           - Leu Leu Ser Pro Val Asn Val Asp Asp Asp Ph - #e Cys Phe Gly Gln Val         #           430                                                               - Leu Gly Met Leu Leu Leu Asp Ser Val Leu Ty - #r Gly Leu Val Thr Trp         #       445                                                                   - Tyr Met Glu Ala Val Phe Pro Gly Gln Phe Gl - #y Val Pro Gln Pro Trp         #   460                                                                       - Tyr Phe Phe Ile Met Pro Ser Tyr Trp Cys Gl - #y Lys Pro Arg Ala Val         465                 4 - #70                 4 - #75                 4 -       #80                                                                           - Ala Gly Lys Glu Glu Glu Asp Ser Asp Pro Gl - #u Lys Ala Leu Arg Asn         #               495                                                           - Glu Tyr Phe Glu Ala Glu Pro Glu Asp Leu Va - #l Ala Gly Ile Lys Ile         #           510                                                               - Lys His Leu Ser Lys Val Phe Arg Val Gly As - #n Lys Asp Arg Ala Ala         #       525                                                                   - Val Arg Asp Leu Asn Leu Asn Leu Tyr Glu Gl - #y Gln Ile Thr Val Leu         #   540                                                                       - Leu Gly His Asn Gly Ala Gly Lys Thr Thr Th - #r Leu Ser Met Leu Thr         545                 5 - #50                 5 - #55                 5 -       #60                                                                           - Gly Leu Phe Pro Pro Thr Ser Gly Arg Ala Ty - #r Ile Ser Gly Tyr Glu         #               575                                                           - Ile Ser Gln Asp Met Val Gln Ile Arg Lys Se - #r Leu Gly Leu Cys Pro       
  #           590                                                               - Gln His Asp Ile Leu Phe Asp Asn Leu Thr Va - #l Ala Glu His Leu Tyr         #       605                                                                   - Phe Tyr Ala Gln Leu Lys Gly Leu Ser Arg Gl - #n Lys Cys Pro Glu Glu         #   620                                                                       - Val Lys Gln Met Leu His Ile Ile Gly Leu Gl - #u Asp Lys Trp Asn Ser         625                 6 - #30                 6 - #35                 6 -       #40                                                                           - Arg Ser Arg Phe Leu Ser Gly Gly Met Arg Ar - #g Lys Leu Ser Ile Gly         #               655                                                           - Ile Ala Leu Ile Ala Gly Ser Lys Val Leu Il - #e Leu Asp Glu Pro Thr         #           670                                                               - Ser Gly Met Asp Ala Ile Ser Arg Arg Ala Il - #e Trp Asp Leu Leu Gln         #       685                                                                   - Arg Gln Lys Ser Asp Arg Thr Ile Val Leu Th - #r Thr His Phe Met Asp         #   700                                                                       - Glu Ala Asp Leu Leu Gly Asp Arg Ile Ala Il - #e Met Ala Lys Gly Glu         705                 7 - #10                 7 - #15                 7 -       #20                                                                           - Leu Gln Cys Cys Gly Ser Ser Leu Phe Leu Ly - #s Gln Lys Tyr Gly Ala         #               735                                                           - Gly Tyr His Met Thr Leu Val Lys Glu Pro Hi - #s Cys Asn Pro Glu Asp         #           750                                                               - Ile Ser Gln Leu Val His His His Val Pro As - #n Ala Thr Leu Glu Ser         #       765                                                                   - Ser Ala Gly Ala Glu Leu Ser Phe Ile Leu Pr - 
#o Arg Glu Ser Thr His         #   780                                                                       - Arg Phe Glu Gly Leu Phe Ala Lys Leu Glu Ly - #s Lys Gln Lys Glu Leu         785                 7 - #90                 7 - #95                 8 -       #00                                                                           - Gly Ile Ala Ser Phe Gly Ala Ser Ile Thr Th - #r Met Glu Glu Val Phe         #               815                                                           - Leu Arg Val Gly Lys Leu Val Asp Ser Ser Me - #t Asp Ile Gln Ala Ile         #           830                                                               - Gln Leu Pro Ala Leu Gln Tyr Gln His Glu Ar - #g Arg Ala Ser Asp Trp         #       845                                                                   - Ala Val Asp Ser Asn Leu Cys Gly Ala Met As - #p Pro Ser Asp Gly Ile         #   860                                                                       - Gly Ala Leu Ile Glu Glu Glu Arg Thr Ala Va - #l Lys Leu Asn Thr Gly         865                 8 - #70                 8 - #75                 8 -       #80                                                                           - Leu Ala Leu His Cys Gln Gln Phe Trp Ala Me - #t Phe Leu Lys Lys Ala         #               895                                                           - Ala Tyr Ser Trp Arg Glu Trp Lys Met Val Al - #a Ala Gln Val Leu Val         #           910                                                               - Pro Leu Thr Cys Val Thr Leu Ala Leu Leu Al - #a Ile Asn Tyr Ser Ser         #       925                                                                   - Glu Leu Phe Asp Asp Pro Met Leu Arg Leu Th - #r Leu Gly Glu Tyr Gly         #   940                                                                       - Arg Thr Val Val Pro Phe Ser Val Pro Gly Th - #r Ser Gln Leu Gly Gln         945                 9 - #50                 9 - #55                 9 -       #60                
                                                           - Gln Leu Ser Glu His Leu Lys Asp Ala Leu Gl - #n Ala Glu Gly Gln Glu         #               975                                                           - Pro Arg Glu Val Leu Gly Asp Leu Glu Glu Ph - #e Leu Ile Phe Arg Ala         #           990                                                               - Ser Val Glu Gly Gly Gly Phe Asn Glu Arg Cy - #s Leu Val Ala Ala Ser         #      10050                                                                  - Phe Arg Asp Val Gly Glu Arg Thr Val Val As - #n Ala Leu Phe Asn Asn         #  10205                                                                      - Gln Ala Tyr His Ser Pro Ala Thr Ala Leu Al - #a Val Val Asp Asn Leu         #               10401030 - #                1035                              - Leu Phe Lys Leu Leu Cys Gly Pro His Ala Se - #r Ile Val Val Ser Asn         #              10550                                                          - Phe Pro Gln Pro Arg Ser Ala Leu Gln Ala Al - #a Lys Asp Gln Phe Asn         #          10705                                                              - Glu Gly Arg Lys Gly Phe Asp Ile Ala Leu As - #n Leu Leu Phe Ala Met         #      10850                                                                  - Ala Phe Leu Ala Ser Thr Phe Ser Ile Leu Al - #a Val Ser Glu Arg Ala         #  11005                                                                      - Val Gln Ala Lys His Val Gln Phe Val Ser Gl - #y Val His Val Ala Ser         #               11201110 - #                1115                              - Phe Trp Leu Ser Ala Leu Leu Trp Asp Leu Il - #e Ser Phe Leu Ile Pro         #              11350                                                          - Ser Leu Leu Leu Leu Val Val Phe Lys Ala Ph - #e Asp Val Arg Ala Phe         #          11505                                                              - Thr Arg Asp Gly His Met Ala Asp Thr Leu Le - #u Leu Leu Leu Leu 
Tyr         #      11650                                                                  - Gly Trp Ala Ile Ile Pro Leu Met Tyr Leu Me - #t Asn Phe Phe Phe Leu         #  11805                                                                      - Gly Ala Ala Thr Ala Tyr Thr Arg Leu Thr Il - #e Phe Asn Ile Leu Ser         #               12001190 - #                1195                              - Gly Ile Ala Thr Phe Leu Met Val Thr Ile Me - #t Arg Ile Pro Ala Val         #              12150                                                          - Lys Leu Glu Glu Leu Ser Lys Thr Leu Asp Hi - #s Val Phe Leu Val Leu         #          12305                                                              - Pro Asn His Cys Leu Gly Met Ala Val Ser Se - #r Phe Tyr Glu Asn Tyr         #      12450                                                                  - Glu Thr Arg Arg Tyr Cys Thr Ser Ser Glu Va - #l Ala Ala His Tyr Cys         #  12605                                                                      - Lys Lys Tyr Asn Ile Gln Tyr Gln Glu Asn Ph - #e Tyr Ala Trp Ser Ala         #               12801270 - #                1275                              - Pro Gly Val Gly Arg Phe Val Ala Ser Met Al - #a Ala Ser Gly Cys Ala         #              12950                                                          - Tyr Leu Ile Leu Leu Phe Leu Ile Glu Thr As - #n Leu Leu Gln Arg Leu         #          13105                                                              - Arg Gly Ile Leu Cys Ala Leu Arg Arg Arg Ar - #g Thr Leu Thr Glu Leu         #      13250                                                                  - Tyr Thr Arg Met Pro Val Leu Pro Glu Asp Gl - #n Asp Val Ala Asp Glu         #  13405                                                                      - Arg Thr Arg Ile Leu Ala Pro Ser Pro Asp Se - #r Leu Leu His Thr Pro         #               13601350 - #                1355                              - Leu Ile Ile Lys Glu Leu Ser Lys Val 
Tyr Gl - #u Gln Arg Val Pro Leu         #              13750                                                          - Leu Ala Val Asp Arg Leu Ser Leu Ala Val Gl - #n Lys Gly Glu Cys Phe         #          13905                                                              - Gly Leu Leu Gly Phe Asn Gly Ala Gly Lys Th - #r Thr Thr Phe Lys Met         #      14050                                                                  - Leu Thr Gly Glu Glu Ser Leu Thr Ser Gly As - #p Ala Phe Val Gly Gly         #  14205                                                                      - His Arg Ile Ser Ser Asp Val Gly Lys Val Ar - #g Gln Arg Ile Gly Tyr         #               14401430 - #                1435                              - Cys Pro Gln Phe Asp Ala Leu Leu Asp His Me - #t Thr Gly Arg Glu Met         #              14550                                                          - Leu Val Met Tyr Ala Arg Leu Arg Gly Ile Pr - #o Glu Arg His Ile Gly         #          14705                                                              - Ala Cys Val Glu Asn Thr Leu Arg Gly Leu Le - #u Leu Glu Pro His Ala         #      14850                                                                  - Asn Lys Leu Val Arg Thr Tyr Ser Gly Gly As - #n Lys Arg Lys Leu Ser         #  15005                                                                      - Thr Gly Ile Ala Leu Ile Gly Glu Pro Ala Va - #l Ile Phe Leu Asp Glu         #               15201510 - #                1515                              - Pro Ser Thr Gly Met Asp Pro Val Ala Arg Ar - #g Leu Leu Trp Asp Thr         #              15350                                                          - Val Ala Arg Ala Arg Glu Ser Gly Lys Ala Il - #e Ile Ile Thr Ser His         #          15505                                                              - Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Ar - #g Leu Ala Ile Met Val         #      15650                                                                  - Gln Gly 
Gln Phe Lys Cys Leu Gly Ser Pro Gl - #n His Leu Lys Ser Lys         #  15805                                                                      - Phe Gly Ser Gly Tyr Ser Leu Arg Ala Lys Va - #l Gln Ser Glu Gly Gln         #               16001590 - #                1595                              - Gln Glu Ala Leu Glu Glu Phe Lys Ala Phe Va - #l Asp Leu Thr Phe Pro         #              16150                                                          - Gly Ser Val Leu Glu Asp Glu His Gln Gly Me - #t Val His Tyr His Leu         #          16305                                                              - Pro Gly Arg Asp Leu Ser Trp Ala Lys Val Ph - #e Gly Ile Leu Glu Lys         #      16450                                                                  - Ala Lys Glu Lys Tyr Gly Val Asp Asp Tyr Se - #r Val Ser Gln Ile Ser         #  16605                                                                      - Leu Glu Gln Val Phe Leu Ser Phe Ala His Le - #u Gln Pro Pro Thr Ala         #               16801670 - #                1675                              - Glu Glu Gly Arg
(2) INFORMATION FOR SEQ ID NO:26:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1375 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown
    (ii) MOLECULE TYPE: protein
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
- Cys Met Glu Glu Glu Pro Thr His Leu Arg Le - #u Gly Val Ser Ile Gln         #                15                                                           - Asn Leu Val Lys Val Tyr Arg Asp Gly Met Ly - #s Val Ala 
Val Asp Gly         #            30                                                               - Leu Ala Leu Asn Phe Tyr Glu Gly Gln Ile Th - #r Ser Phe Leu Gly His         #        45                                                                   - Asn Gly Ala Gly Lys Thr Thr Thr Met Ser Il - #e Leu Thr Gly Leu Phe         #    60                                                                       - Pro Pro Thr Ser Gly Thr Ala Tyr Ile Leu Gl - #y Lys Asp Ile Arg Ser         #80                                                                           - Glu Met Ser Ser Ile Arg Gln Asn Leu Gly Va - #l Cys Pro Gln His Asn         #                95                                                           - Val Leu Phe Asp Met Leu Thr Val Glu Glu Hi - #s Ile Trp Phe Tyr Ala         #           110                                                               - Arg Leu Lys Gly Leu Ser Glu Lys His Val Ly - #s Ala Glu Met Glu Gln         #       125                                                                   - Met Ala Leu Asp Val Gly Leu Pro Pro Ser Ly - #s Leu Lys Ser Lys Thr         #   140                                                                       - Ser Gln Leu Ser Gly Gly Met Gln Arg Lys Le - #u Ser Val Ala Leu Ala         145                 1 - #50                 1 - #55                 1 -       #60                                                                           - Phe Val Gly Gly Ser Lys Val Val Ile Leu As - #p Glu Pro Thr Ala Gly         #               175                                                           - Val Asp Pro Tyr Ser Arg Arg Gly Ile Trp Gl - #u Leu Leu Leu Lys Tyr         #           190                                                               - Arg Gln Gly Arg Thr Ile Ile Leu Ser Thr Hi - #s His Met Asp Glu Ala         #       205                                                                   - Asp Ile Leu Gly Asp Arg Ile Ala Ile Ile Se - #r His Gly Lys Leu Cys         #   220                       
                                                - Cys Val Gly Ser Ser Leu Phe Leu Lys Asn Gl - #n Leu Gly Thr Gly Tyr         225                 2 - #30                 2 - #35                 2 -       #40                                                                           - Tyr Leu Thr Leu Val Lys Lys Asp Val Glu Se - #r Ser Leu Ser Ser Cys         #               255                                                           - Arg Asn Ser Ser Ser Thr Val Ser Cys Leu Ly - #s Lys Glu Asp Ser Val         #           270                                                               - Ser Gln Ser Ser Ser Asp Ala Gly Leu Gly Se - #r Asp His Glu Ser Asp         #       285                                                                   - Thr Leu Thr Ile Asp Val Ser Ala Ile Ser As - #n Leu Ile Arg Lys His         #   300                                                                       - Val Ser Glu Ala Arg Leu Val Glu Asp Ile Gl - #y His Glu Leu Thr Tyr         305                 3 - #10                 3 - #15                 3 -       #20                                                                           - Val Leu Pro Tyr Glu Ala Ala Lys Glu Gly Al - #a Phe Val Glu Leu Phe         #               335                                                           - His Glu Ile Asp Asp Arg Leu Ser Asp Leu Gl - #y Ile Ser Ser Tyr Gly         #           350                                                               - Ile Ser Glu Thr Thr Leu Glu Glu Ile Phe Le - #u Lys Val Ala Glu Glu         #       365                                                                   - Ser Gly Val Asp Ala Glu Thr Ser Asp Gly Th - #r Leu Pro Ala Arg Arg         #   380                                                                       - Asn Arg Arg Ala Phe Gly Asp Lys Gln Ser Cy - #s Leu His Pro Phe Thr         385                 3 - #90                 3 - #95                 4 -       #00                                                                           - 
Glu Asp Asp Ala Val Asp Pro Asn Asp Ser As - #p Ile Asp Pro Glu Ser         #               415                                                           - Arg Glu Thr Asp Leu Leu Ser Gly Met Asp Gl - #y Lys Gly Ser Tyr Gln         #           430                                                               - Leu Lys Gly Trp Lys Leu Thr Gln Gln Gln Ph - #e Val Ala Leu Leu Trp         #       445                                                                   - Lys Arg Leu Leu Ile Ala Arg Arg Ser Arg Ly - #s Gly Phe Phe Ala Gln         #   460                                                                       - Ile Val Leu Pro Ala Val Phe Val Cys Ile Al - #a Leu Val Phe Ser Leu         465                 4 - #70                 4 - #75                 4 -       #80                                                                           - Ile Val Pro Pro Phe Gly Lys Tyr Pro Ser Le - #u Glu Leu Gln Pro Trp         #               495                                                           - Met Tyr Asn Glu Gln Tyr Thr Phe Val Ser As - #n Asp Ala Pro Glu Asp         #           510                                                               - Met Gly Thr Gln Glu Leu Leu Asn Ala Leu Th - #r Lys Asp Pro Gly Phe         #       525                                                                   - Gly Thr Arg Cys Met Glu Gly Asn Pro Ile Pr - #o Asp Thr Pro Cys Leu         #   540                                                                       - Ala Gly Glu Glu Asp Trp Thr Ile Ser Pro Va - #l Pro Gln Ser Ile Val         545                 5 - #50                 5 - #55                 5 -       #60                                                                           - Asp Leu Phe Gln Asn Gly Asn Trp Thr Met Ly - #s Asn Pro Ser Pro Ala         #               575                                                           - Cys Gln Cys Ser Ser Asp Lys Ile Lys Lys Me - #t Leu Pro Val Cys Pro         #           590                                     
                          - Pro Gly Ala Gly Gly Leu Pro Pro Pro Gln Ar - #g Lys Gln Lys Thr Ala         #       605                                                                   - Asp Ile Leu Gln Asn Leu Thr Gly Arg Asn Il - #e Ser Asp Tyr Leu Val         #   620                                                                       - Lys Thr Tyr Val Gln Ile Ile Ala Lys Ser Le - #u Lys Asn Lys Ile Trp         625                 6 - #30                 6 - #35                 6 -       #40                                                                           - Val Asn Glu Phe Arg Tyr Gly Gly Phe Ser Le - #u Gly Val Ser Asn Ser         #               655                                                           - Gln Ala Leu Pro Pro Ser His Glu Val Asn As - #p Ala Ile Lys Gln Met         #           670                                                               - Lys Lys Leu Leu Lys Leu Thr Lys Asp Thr Se - #r Ala Asp Arg Phe Leu         #       685                                                                   - Ser Ser Leu Gly Arg Phe Met Ala Gly Leu As - #p Thr Lys Asn Asn Val         #   700                                                                       - Lys Val Trp Phe Asn Asn Lys Gly Trp His Al - #a Ile Ser Ser Phe Leu         705                 7 - #10                 7 - #15                 7 -       #20                                                                           - Asn Val Ile Asn Asn Ala Ile Leu Arg Ala As - #n Leu Gln Lys Gly Glu         #               735                                                           - Asn Pro Ser Gln Tyr Gly Ile Thr Ala Phe As - #n His Pro Leu Asn Leu         #           750                                                               - Thr Lys Gln Gln Leu Ser Glu Val Ala Leu Me - #t Thr Thr Ser Val Asp         #       765                                                                   - Val Leu Val Ser Ile Cys Val Ile Phe Ala Me - #t Ser Phe Val Pro Ala         #   780                 
                                                      - Ser Phe Val Val Phe Leu Ile Gln Glu Arg Va - #l Ser Lys Ala Lys His         785                 7 - #90                 7 - #95                 8 -       #00                                                                           - Leu Gln Phe Ile Ser Gly Val Lys Pro Val Il - #e Tyr Trp Leu Ser Asn         #               815                                                           - Phe Val Trp Asp Met Cys Asn Tyr Val Val Pr - #o Ala Thr Leu Val Ile         #           830                                                               - Ile Ile Phe Ile Cys Phe Gln Gln Lys Ser Ty - #r Val Ser Ser Thr Asn         #       845                                                                   - Leu Pro Val Leu Ala Leu Leu Leu Leu Leu Ty - #r Gly Trp Ser Ile Thr         #   860                                                                       - Pro Leu Met Tyr Pro Ala Ser Phe Val Phe Ly - #s Ile Pro Ser Thr Ala         865                 8 - #70                 8 - #75                 8 -       #80                                                                           - Tyr Val Val Leu Thr Ser Val Asn Leu Phe Il - #e Gly Ile Asn Gly Ser         #               895                                                           - Val Ala Thr Phe Val Leu Glu Leu Phe Thr As - #n Asn Lys Leu Asn Asp         #           910                                                               - Ile Asn Asp Ile Leu Lys Ser Val Phe Leu Il - #e Phe Pro His Phe Cys         #       925                                                                   - Leu Gly Arg Gly Leu Ile Asp Met Val Lys As - #n Gln Ala Met Ala Asp         #   940                                                                       - Ala Leu Glu Arg Phe Gly Glu Asn Arg Phe Va - #l Ser Pro Leu Ser Trp         945                 9 - #50                 9 - #55                 9 -       #60                                                                       
    - Asp Leu Val Gly Arg Asn Leu Phe Ala Met Al - #a Val Glu Gly Val Val         #               975                                                           - Phe Phe Leu Ile Thr Val Leu Ile Gln Tyr Ar - #g Phe Phe Ile Arg Pro         #           990                                                               - Arg Pro Val Lys Ala Lys Leu Pro Pro Leu As - #n Asp Glu Asp Glu Asp         #      10050                                                                  - Val Arg Arg Glu Arg Gln Arg Ile Leu Asp Gl - #y Gly Gly Gln Asn Asp         #  10205                                                                      - Ile Leu Glu Ile Lys Glu Leu Thr Lys Ile Ty - #r Arg Arg Lys Arg Lys         #               10401030 - #                1035                              - Pro Ala Val Asp Arg Ile Cys Ile Gly Ile Pr - #o Pro Gly Glu Cys Phe         #              10550                                                          - Gly Leu Leu Gly Val Asn Gly Ala Gly Lys Se - #r Thr Thr Phe Lys Met         #          10705                                                              - Leu Thr Gly Asp Thr Pro Val Thr Arg Gly As - #p Ala Phe Leu Asn Lys         #      10850                                                                  - Asn Ser Ile Leu Ser Asn Ile His Glu Val Hi - #s Gln Asn Met Gly Tyr         #  11005                                                                      - Cys Pro Gln Phe Asp Ala Ile Thr Glu Leu Le - #u Thr Gly Arg Glu His         #               11201110 - #                1115                              - Val Glu Phe Phe Ala Leu Leu Arg Gly Val Pr - #o Glu Lys Glu Val Gly         #              11350                                                          - Lys Phe Gly Glu Trp Ala Ile Arg Lys Leu Gl - #y Leu Val Lys Tyr Gly         #          11505                                                              - Glu Lys Tyr Ala Ser Asn Tyr Ser Gly Gly As - #n Lys Arg Lys Leu Ser         #      11650                                  
                                - Thr Ala Met Ala Leu Ile Gly Gly Pro Pro Va - #l Val Phe Leu Asp Glu         #  11805                                                                      - Pro Thr Thr Gly Met Asp Pro Lys Ala Arg Ar - #g Phe Leu Trp Asn Cys         #               12001190 - #                1195                              - Ala Leu Ser Ile Val Lys Glu Gly Arg Ser Va - #l Val Leu Thr Ser His         #              12150                                                          - Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Ar - #g Met Ala Ile Met Val         #          12305                                                              - Asn Gly Arg Phe Arg Cys Leu Gly Ser Val Gl - #n His Leu Lys Asn Arg         #      12450                                                                  - Phe Gly Asp Gly Tyr Thr Ile Val Val Arg Il - #e Ala Gly Ser Asn Pro         #  12605                                                                      - Asp Leu Lys Pro Val Gln Glu Phe Phe Gly Le - #u Ala Phe Pro Gly Ser         #               12801270 - #                1275                              - Val Leu Lys Glu Lys His Arg Asn Met Leu Gl - #n Tyr Gln Leu Pro Ser         #              12950                                                          - Ser Leu Ser Ser Leu Ala Arg Ile Phe Ser Il - #e Leu Ser Gln Ser Lys         #          13105                                                              - Lys Arg Leu His Ile Glu Asp Tyr Ser Val Se - #r Gln Thr Thr Leu Asp         #      13250                                                                  - Gln Val Phe Val Asn Phe Ala Lys Asp Gln Se - #r Asp Asp Asp His Leu         #  13405                                                                      - Lys Asp Leu Ser Leu His Lys Asn Gln Thr Va - #l Val Asp Val Ala Val         #               13601350 - #                1355                              - Leu Thr Ser Phe Leu Gln Asp Glu Lys Val Ly - #s Glu Ser Tyr Val             #              
13750
(2) INFORMATION FOR SEQ ID NO:27:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1457 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown
    (ii) MOLECULE TYPE: protein
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
- Met Glu Glu Glu Pro Thr His Leu Pro Leu Va - #l Val Cys Val Asp Lys         #                15                                                           - Leu Thr Lys Val Tyr Lys Asn Asp Lys Lys Le - #u Ala Leu Asn Lys Leu         #            30                                                               - Ser Leu Asn Leu Tyr Glu Asn Gln Val Val Se - #r Phe Leu Gly His Asn         #        45                                                                   - Gly Ala Gly Lys Thr Thr Thr Met Ser Ile Le - #u Thr Gly Leu Phe Pro         #    60                                                                       - Pro Thr Ser Gly Ser Ala Thr Ile Tyr Gly Hi - #s Asp Ile Arg Thr Glu         #80                                                                           - Met Asp Glu Ile Arg Lys Asn Leu Gly Met Cy - #s Pro Gln His Asn Val         #                95                                                           - Leu Phe Asp Arg Leu Thr Val Glu Glu His Le - #u Trp Phe Tyr Ser Arg         #           110                                                               - Leu Lys Ser Met Ala Gln Glu Glu Ile Arg Ly - #s Glu Thr Asp Lys Met         #       125                                                                   - Ile Glu Asp Leu Glu Leu Ser Asn Lys Arg Hi - #s Ser Leu Val 
Gln Thr         #   140                                                                       - Leu Ser Gly Gly Met Lys Arg Lys Leu Ser Va - #l Ala Ile Ala Phe Val         145                 1 - #50                 1 - #55                 1 -       #60                                                                           - Gly Gly Ser Arg Ala Ile Ile Leu Asp Glu Pr - #o Thr Ala Gly Val Asp         #               175                                                           - Pro Tyr Ala Arg Arg Ala Ile Trp Asp Leu Il - #e Leu Lys Tyr Lys Pro         #           190                                                               - Gly Arg Thr Ile Leu Leu Ser Thr His His Me - #t Asp Glu Ala Asp Leu         #       205                                                                   - Leu Gly Asp Arg Ile Ala Ile Ile Ser His Gl - #y Lys Leu Lys Cys Cys         #   220                                                                       - Gly Ser Pro Leu Phe Leu Lys Gly Ala Tyr Xa - #a Asp Gly Tyr Arg Leu         225                 2 - #30                 2 - #35                 2 -       #40                                                                           - Thr Leu Val Lys Gln Pro Ala Glu Pro Gly Th - #r Ser Gln Glu Pro Gly         #               255                                                           - Leu Ala Ser Ser Pro Ser Gly Cys Pro Arg Le - #u Ser Ser Cys Ser Glu         #           270                                                               - Pro Gln Val Ser Gln Phe Ile Arg Lys His Va - #l Ala Ser Ser Leu Leu         #       285                                                                   - Val Ser Asp Thr Ser Thr Glu Leu Ser Tyr Il - #e Leu Pro Ser Glu Ala         #   300                                                                       - Val Lys Lys Gly Ala Phe Glu Arg Leu Phe Gl - #n Gln Leu Glu His Ser         305                 3 - #10                 3 - #15                 3 -       #20                               
Leu Asp Ala Leu His Leu Ser Ser Phe Gly Leu Met Asp Thr Thr Leu
                                                        335
Glu Glu Val Phe Leu Lys Val Ser Glu Glu Asp Gln Ser Leu Glu Asn
                                                    350
Ser Glu Ala Asp Val Lys Glu Ser Arg Lys Asp Val Leu Pro Gly Ala
                                                365
Glu Gly Leu Thr Ala Val Gly Gly Gln Ala Gly Asn Leu Ala Arg Cys
                                            380
Ser Glu Leu Ala Gln Ser Gln Ala Ser Leu Gln Ser Ala Ser Ser Val
385                 390                 395                 400
Gly Ser Ala Arg Gly Glu Glu Gly Thr Gly Tyr Ser Asp Gly Tyr Gly
                                                        415
Asp Tyr Arg Pro Leu Phe Asp Asn Leu Gln Asp Pro Asp Asn Val Ser
                                                    430
Leu Gln Glu Ala Glu Met Glu Ala Leu Ala Gln Val Gly Gln Gly Ser
                                                445
Arg Lys Leu Glu Gly Trp Trp Leu Lys Met Arg Gln Phe His Gly Leu
                                            460
Leu Val Lys Arg Phe His Cys Ala Arg Arg Asn Ser Lys Ala Leu Cys
465                 470                 475                 480
Ser Gln Ile Leu Leu Pro Ala Phe Phe Val Cys Val Ala Met Thr Val
                                                        495
Ala Leu Ser Val Pro Glu Ile Gly Asp Leu Pro Pro Leu Val Leu Ser
                                                    510
Pro Ser Gln Tyr His Asn Tyr Thr Gln Pro Arg Gly Asn Phe Ile Pro
                                                525
Tyr Ala Asn Glu Glu Arg Gln Glu Tyr Arg Leu Arg Leu Ser Pro Asp
                                            540
Ala Ser Pro Gln Gln Leu Val Ser Thr Phe Arg Leu Pro Ser Gly Val
545                 550                 555                 560
Gly Ala Thr Cys Val Leu Lys Ser Pro Ala Asn Gly Ser Leu Gly Pro
                                                        575
Met Leu Asn Leu Ser Ser Gly Glu Ser Arg Leu Leu Ala Ala Arg Phe
                                                    590
Phe Asp Ser Met Cys Leu Glu Ser Phe Thr Gln Gly Leu Pro Leu Ser
                                                605
Asn Phe Val Pro Pro Pro Pro Ser Pro Ala Pro Ser Asp Ser Pro Val
                                            620
Xaa Pro Asp Glu Asp Ser Leu Gln Ala Trp Asn Met Ser Leu Pro Pro
625                 630                 635                 640
Thr Ala Gly Pro Glu Thr Trp Thr Ser Ala Pro Ser Leu Pro Arg Leu
                                                        655
Val His Glu Pro Val Arg Cys Thr Cys Ser Ala Gln Gly Thr Gly Phe
                                                    670
Ser Cys Pro Ser Ser Val Gly Gly His Pro Pro Gln Met Arg Val Val
                                                685
Thr Gly Asp Ile Leu Thr Asp Ile Thr Gly His Asn Val Ser Glu Tyr
                                            700
Leu Leu Phe Thr Ser Asp Arg Phe Arg Leu His Arg Tyr Gly Ala Ile
705                 710                 715                 720
Thr Phe Gly Asn Val Gln Lys Ser Ile Pro Ala Ser Phe Gly Ala Arg
                                                        735
Val Pro Pro Met Val Arg Lys Ile Ala Val Arg Arg Val Ala Gln Val
                                                    750
Leu Tyr Asn Asn Lys Gly Tyr His Ser Met Pro Thr Tyr Leu Asn Ser
                                                765
Leu Asn Asn Ala Ile Leu Arg Ala Asn Leu Pro Lys Ser Lys Gly Asn
                                            780
Pro Ala Ala Tyr Xaa Ile Thr Val Thr Asn His Pro Met Asn Lys Thr
785                 790                 795                 800
Ser Ala Ser Leu Ser Leu Asp Tyr Leu Leu Gln Gly Thr Asp Val Val
                                                        815
Ile Ala Ile Phe Ile Ile Val Ala Met Ser Phe Val Pro Ala Ser Phe
                                                    830
Val Val Phe Leu Val Ala Glu Lys Ser Thr Lys Ala Lys His Leu Gln
                                                845
Phe Val Ser Gly Cys Asn Pro Val Ile Tyr Trp Leu Ala Asn Tyr Val
                                            860
Trp Asp Met Leu Asn Tyr Leu Val Pro Ala Thr Cys Cys Val Ile Ile
865                 870                 875                 880
Leu Phe Val Phe Asp Leu Pro Ala Tyr Thr Ser Pro Thr Asn Phe Pro
                                                        895
Ala Val Leu Ser Leu Phe Leu Leu Tyr Gly Trp Ser Ile Thr Pro Ile
                                                    910
Met Tyr Pro Ala Ser Phe Trp Phe Glu Val Pro Ser Ser Ala Tyr Val
                                                925
Phe Leu Ile Val Ile Asn Leu Phe Ile Gly Ile Thr Ala Thr Val Ala
                                            940
Thr Phe Leu Leu Gln Leu Phe Glu His Asp Lys Asp Leu Lys Val Val
945                 950                 955                 960
Asn Ser Tyr Leu Lys Ser Cys Phe Leu Ile Phe Pro Asn Tyr Asn Leu
                                                        975
Gly His Gly Leu Met Glu Met Ala Tyr Asn Glu Tyr Ile Asn Glu Tyr
                                                    990
Tyr Ala Lys Ile Gly Gln Phe Asp Lys Met Lys Ser Pro Phe Glu Trp
                                                1005
Asp Ile Val Thr Arg Gly Leu Val Ala Met Thr Val Glu Gly Phe Val
                                            1020
Gly Phe Phe Leu Thr Ile Met Cys Gln Tyr Asn Phe Leu Arg Gln Pro
1025                1030                1035                1040
Gln Arg Leu Pro Val Ser Thr Lys Pro Val Glu Asp Asp Val Asp Val
                                                        1055
Ala Ser Glu Arg Gln Arg Val Leu Arg Gly Asp Ala Asp Asn Asp Met
                                                    1070
Val Lys Ile Glu Asn Leu Thr Lys Val Tyr Lys Ser Arg Lys Ile Gly
                                                1085
Arg Ile Leu Ala Val Asp Arg Leu Cys Leu Gly Val Cys Val Pro Gly
                                            1100
Glu Cys Phe Gly Leu Leu Gly Val Asn Gly Ala Gly Lys Thr Ser Thr
1105                1110                1115                1120
Phe Lys Met Leu Thr Gly Asp Glu Ser Thr Thr Gly Gly Glu Ala Phe
                                                        1135
Val Asn Gly His Ser Val Leu Lys Asp Leu Leu Gln Val Gln Gln Ser
                                                    1150
Leu Gly Tyr Cys Pro Gln Phe Asp Val Pro Val Asp Glu Leu Thr Ala
                                                1165
Arg Glu His Leu Gln Leu Tyr Thr Arg Leu Arg Cys Ile Pro Trp Lys
                                            1180
Asp Glu Ala Gln Val Val Lys Trp Ala Leu Glu Lys Leu Glu Leu Thr
1185                1190                1195                1200
Lys Tyr Ala Asp Lys Pro Ala Gly Thr Tyr Ser Gly Gly Asn Lys Arg
                                                        1215
Lys Leu Ser Thr Ala Ile Ala Leu Ile Gly Tyr Pro Ala Phe Ile Phe
                                                    1230
Leu Asp Glu Pro Thr Thr Gly Met Asp Pro Lys Ala Arg Arg Phe Leu
                                                1245
Trp Asn Leu Ile Leu Asp Leu Ile Lys Thr Gly Arg Ser Val Val Leu
                                            1260
Thr Ser His Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Arg Leu Ala
1265                1270                1275                1280
Ile Met Val Asn Gly Arg Leu His Cys Leu Gly Ser Ile Gln His Leu
                                                        1295
Lys Asn Arg Phe Gly Asp Gly Tyr Met Ile Thr Val Arg Thr Lys Ser
                                                    1310
Ser Gln Asn Val Lys Asp Val Val Arg Phe Phe Asn Arg Asn Phe Pro
                                                1325
Glu Ala His Ala Gln Gly Lys Thr Pro Tyr Lys Val Gln Tyr Gln Leu
                                            1340
Lys Ser Glu His Ile Ser Leu Ala Gln Val Phe Ser Lys Met Glu Gln
1345                1350                1355                1360
Val Val Gly Val Leu Gly Ile Glu Asp Tyr Ser Val Ser Gln Thr Thr
                                                        1375
Leu Asp Asn Val Phe Val Asn Phe Ala Lys Lys Gln Ser Asp Asn Val
                                                    1390
Glu Gln Gln Glu Ala Glu Pro Ser Ser Leu Pro Ser Pro Leu Gly Leu
                                                1405
Leu Ser Leu Leu Arg Pro Arg Pro Ala Pro Thr Glu Leu Arg Ala Leu
                                            1420
Val Ala Asp Glu Pro Glu Asp Leu Asp Thr Glu Asp Glu Gly Leu Ile
1425                1430                1435                1440
Ser Phe Glu Glu Glu Arg Ala Gln Leu Ser Phe Asn Thr Asp Thr Leu
                                                        1455
Cys

(2) INFORMATION FOR SEQ ID NO:28:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1548 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: cDNA
    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 49..1271
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
GAGGCCC CTTCCTGTAC CTTCAGGGAT CGGCCACC ATG TCC CAC      57
                                       Met Ser His
                                       1
CGG AAG TTT TCC GCC CCT CGG CAC GGA CAC CTG GGC TTC CTG CCC CAT      105
Arg Lys Phe Ser Ala Pro Arg His Gly His Leu Gly Phe Leu Pro His
                                            15
AAG AGG AGC CAC CGG CAC CGG GGC AAG GTG AAG ACG TGG CCG CGG GAT      153
Lys Arg Ser His Arg His Arg Gly Lys Val Lys Thr Trp Pro Arg Asp
                                                            35
GAC CCC AGC CAG CCC GTG CAC CTC ACG GCC TTC CTG GGC TAC AAG GCG      201
Asp Pro Ser Gln Pro Val His Leu Thr Ala Phe Leu Gly Tyr Lys Ala
                                                        50
GGC ATG ACC CAC ACC CTG CGG GAG GTG CAC CGG CCG GGG CTC AAA ATT      249
Gly Met Thr His Thr Leu Arg Glu Val His Arg Pro Gly Leu Lys Ile
                                                    65
TCC AAA CGG GAG GAG GTG GAG GCG GTG ACA ATT GTA GAA ACG CCG CCC      297
Ser Lys Arg Glu Glu Val Glu Ala Val Thr Ile Val Glu Thr Pro Pro
                                                80
CTA GTG GTG GTG GGC GTG GTG GGC TAC GTG GCC ACC CCT CGA GGT CTC      345
Leu Val Val Val Gly Val Val Gly Tyr Val Ala Thr Pro Arg Gly Leu
                                            95
CGG AGC TTC AAG ACC ATC TTT GCA GAA CAC CTC AGT GAT GAG TGC CGG      393
Arg Ser Phe Lys Thr Ile Phe Ala Glu His Leu Ser Asp Glu Cys Arg
100                 105                 110                 115
CGC CGA TTC TAC AAG GAC TGG CAC AAG AGC AAG AAG AAA GCC TTC ACC      441
Arg Arg Phe Tyr Lys Asp Trp His Lys Ser Lys Lys Lys Ala Phe Thr
                                                        130
AAG GCC TGC AAG AGG TGG CGG GAC ACA GAC GGG AAA AAG CAG CTA CAG      489
Lys Ala Cys Lys Arg Trp Arg Asp Thr Asp Gly Lys Lys Gln Leu Gln
                                                    145
AAG GAC TTC GCC GCC ATG AAG AAG TAC TGC AAG GTC ATT CGG GTC ATT      537
Lys Asp Phe Ala Ala Met Lys Lys Tyr Cys Lys Val Ile Arg Val Ile
                                                160
GTC CAC ACT CAG ATG AAA CTG CTG CCC TTC CGG CAG AAG AAG GCC CAC      585
Val His Thr Gln Met Lys Leu Leu Pro Phe Arg Gln Lys Lys Ala His
                                            175
ATC ATG GAG ATC CAG CTG AAC GGT GGC ACG GTG GCC GAG AAG GTG GCC      633
Ile Met Glu Ile Gln Leu Asn Gly Gly Thr Val Ala Glu Lys Val Ala
180                 185                 190                 195
TGG GCC CAG GCC CGG CTG GAG AAG CAG GTG CCC GTG CAC AGC GTG TTC      681
Trp Ala Gln Ala Arg Leu Glu Lys Gln Val Pro Val His Ser Val Phe
                                                        210
AGC CAG AGT GAG GTC ATT GAT GTC ATT GCT GTC ACC AAG GGT CGA GGC      729
Ser Gln Ser Glu Val Ile Asp Val Ile Ala Val Thr Lys Gly Arg Gly
                                                    225
GTC AAA GGG GTC ACA AGC CGC TGG CAT ACC AAG AAG CTG CCG CGC AAG      777
Val Lys Gly Val Thr Ser Arg Trp His Thr Lys Lys Leu Pro Arg Lys
                                                240
ACC CAT AAG GGC CTG CGC AAG GTG GCC TGC ATT GGC GCC TGG CAC CCC      825
Thr His Lys Gly Leu Arg Lys Val Ala Cys Ile Gly Ala Trp His Pro
                                            255
GCC CGC GTG GGC TGC TCC ATT GCT CGG GCC GGG CAG AAG GGC TAT CAC      873
Ala Arg Val Gly Cys Ser Ile Ala Arg Ala Gly Gln Lys Gly Tyr His
260                 265                 270                 275
CAC CGC ACG GAG CTC AAC AAG AAG ATC TTC CGC ATC GGC AGG GGC CCG      921
His Arg Thr Glu Leu Asn Lys Lys Ile Phe Arg Ile Gly Arg Gly Pro
                                                        290
CAC ATG GAG GAC GGG AAG CTG GTG AAG AAC AAT GCA TCC ACC AGC TAC      969
His Met Glu Asp Gly Lys Leu Val Lys Asn Asn Ala Ser Thr Ser Tyr
                                                    305
GAC GTG ACT GCC AAG TCC ATC ACA CCG CTG GGT GGC TTC CCC CAC TAC      1017
Asp Val Thr Ala Lys Ser Ile Thr Pro Leu Gly Gly Phe Pro His Tyr
                                                320
GGG GAA GTG AAC AAC GAC TTC GTC ATG CTG AAG GGT TGT ATT GCT GGT      1065
Gly Glu Val Asn Asn Asp Phe Val Met Leu Lys Gly Cys Ile Ala Gly
                                            335
ACC AAG AAG CGG GTC ATT ACG CTG AGA AAG TCC CTC CTG GTG CAT CAC      1113
Thr Lys Lys Arg Val Ile Thr Leu Arg Lys Ser Leu Leu Val His His
340                 345                 350                 355
AGT CGC CAA GCC GTG GAG AAT ATT GAG CTC AAG TTC ATT GAC ACC ACC      1161
Ser Arg Gln Ala Val Glu Asn Ile Glu Leu Lys Phe Ile Asp Thr Thr
                                                        370
TCC AAG TTC GGC CAT GGC CGC TTC CAG ACA GCC CAA GAG AAG AGG GCC      1209
Ser Lys Phe Gly His Gly Arg Phe Gln Thr Ala Gln Glu Lys Arg Ala
                                                    385
TTC ATG GGC CCC CAA AAG AAG CAT CTG GAG AAG GAA ACG CCG GAG ACC      1257
Phe Met Gly Pro Gln Lys Lys His Leu Glu Lys Glu Thr Pro Glu Thr
                                                400
TCG GGA GAC TTG TA GGCTGTGTGG GGTGGATGAA CCCTGAAGCG CACCGCACTG      1311
Ser Gly Asp Leu
    405
TCTGCCCCAA TGTCTAACAA AGGCCGGAGG CGACTCTTCC TGCGAGGTCT CAGAGCGCTG      1371
TGTAACCGCC CAAGGGGTTC ACCTTGCCTG CTGCCTAGAC AAAGCCGATT CATTAAGACA      1431
GGGGAATTGC AATAGAGAAA GAGTAATTCA CACAGAGCTG GCTGTGCGGG AGACCGGAGT      1491
TTTATGTTTT ATTATTACTC AAATCGATCT CTTTGAGCAA AAAAAAAAAA AAAAAAA      1548

(2) INFORMATION FOR SEQ ID NO:29:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 407 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
Met Ser His Arg Lys Phe Ser Ala Pro Arg His Gly His Leu Gly Phe
                                                        15
Leu Pro His Lys Arg Ser His Arg His Arg Gly Lys Val Lys Thr Trp
                                                    30
Pro Arg Asp Asp Pro Ser Gln Pro Val His Leu Thr Ala Phe Leu Gly
                                                45
Tyr Lys Ala Gly Met Thr His Thr Leu Arg Glu Val His Arg Pro Gly
                                            60
Leu Lys Ile Ser Lys Arg Glu Glu Val Glu Ala Val Thr Ile Val Glu
                                                            80
Thr Pro Pro Leu Val Val Val Gly Val Val Gly Tyr Val Ala Thr Pro
                                                        95
Arg Gly Leu Arg Ser Phe Lys Thr Ile Phe Ala Glu His Leu Ser Asp
                                                    110
Glu Cys Arg Arg Arg Phe Tyr Lys Asp Trp His Lys Ser Lys Lys Lys
                                                125
Ala Phe Thr Lys Ala Cys Lys Arg Trp Arg Asp Thr Asp Gly Lys Lys
                                            140
Gln Leu Gln Lys Asp Phe Ala Ala Met Lys Lys Tyr Cys Lys Val Ile
145                 150                 155                 160
Arg Val Ile Val His Thr Gln Met Lys Leu Leu Pro Phe Arg Gln Lys
                                                        175
Lys Ala His Ile Met Glu Ile Gln Leu Asn Gly Gly Thr Val Ala Glu
                                                    190
Lys Val Ala Trp Ala Gln Ala Arg Leu Glu Lys Gln Val Pro Val His
                                                205
Ser Val Phe Ser Gln Ser Glu Val Ile Asp Val Ile Ala Val Thr Lys
                                            220
Gly Arg Gly Val Lys Gly Val Thr Ser Arg Trp His Thr Lys Lys Leu
225                 230                 235                 240
Pro Arg Lys Thr His Lys Gly Leu Arg Lys Val Ala Cys Ile Gly Ala
                                                        255
Trp His Pro Ala Arg Val Gly Cys Ser Ile Ala Arg Ala Gly Gln Lys
                                                    270
Gly Tyr His His Arg Thr Glu Leu Asn Lys Lys Ile Phe Arg Ile Gly
                                                285
Arg Gly Pro His Met Glu Asp Gly Lys Leu Val Lys Asn Asn Ala Ser
                                            300
Thr Ser Tyr Asp Val Thr Ala Lys Ser Ile Thr Pro Leu Gly Gly Phe
305                 310                 315                 320
Pro His Tyr Gly Glu Val Asn Asn Asp Phe Val Met Leu Lys Gly Cys
                                                        335
Ile Ala Gly Thr Lys Lys Arg Val Ile Thr Leu Arg Lys Ser Leu Leu
                                                    350
Val His His Ser Arg Gln Ala Val Glu Asn Ile Glu Leu Lys Phe Ile
                                                365
Asp Thr Thr Ser Lys Phe Gly His Gly Arg Phe Gln Thr Ala Gln Glu
                                            380
Lys Arg Ala Phe Met Gly Pro Gln Lys Lys His Leu Glu Lys Glu Thr
385                 390                 395                 400
Pro Glu Thr Ser Gly Asp Leu
                405

(2) INFORMATION FOR SEQ ID NO:30:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 403 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown
    (ii) MOLECULE TYPE: protein
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
Met Ser His Arg Lys Phe Ser Ala Pro Arg His Gly Ser Leu Gly Phe
                                                        15
Leu Pro Arg Lys Arg Ser Ser Arg His Arg Gly Lys Val Lys Ser Phe
                                                    30
Pro Lys Asp Asp Pro Ser Lys Pro Val His Leu Thr Ala Phe Leu Gly
                                                45
Tyr Lys Ala Gly Met Thr His Ile Val Arg Glu Val Asp Arg Pro Gly
                                            60
Ser Lys Val Asn Lys Lys Glu Val Val Glu Ala Val Thr Ile Val Glu
                                                            80
Thr Pro Pro Met Val Val Val Gly Ile Val Gly Tyr Val Glu Thr Pro
                                                        95
Arg Gly Leu Arg Thr Phe Lys Thr Val Phe Ala Glu His Ile Ser Asp
                                                    110
Glu Cys Lys Arg Arg Phe Tyr Lys Asn Trp His Lys Ser Lys Lys Lys
                                                125
Ala Phe Thr Lys Tyr Cys Lys Lys Trp Gln Asp Glu Asp Gly Lys Lys
                                            140
Gln Leu Glu Lys Asp Phe Ser Ser Met Lys Lys Tyr Cys Gln Val Ile
145                 150                 155                 160
Arg Val Ile Ala His Thr Gln Met Arg Leu Leu Pro Leu Arg Gln Lys
                                                        175
Lys Ala His Leu Met Glu Ile Gln Val Asn Gly Gly Thr Val Ala Glu
                                                    190
Lys Leu Asp Trp Ala Arg Glu Arg Leu Glu Gln Gln Val Pro Val Asn
                                                205
Gln Val Phe Gly Gln Asp Glu Met Ile Asp Val Ile Gly Val Thr Lys
                                            220
Gly Lys Gly Tyr Lys Gly Val Thr Ser Arg Trp His Thr Lys Lys Leu
225                 230                 235                 240
Pro Arg Lys Thr His Arg Gly Leu Arg Lys Val Ala Cys Ile Gly Ala
                                                        255
Trp His Pro Ala Arg Val Ala Phe Ser Val Ala Arg Ala Gly Gln Lys
                                                    270
Gly Tyr His His Arg Thr Glu Ile Asn Lys Lys Ile Tyr Lys Ile Gly
                                                285
Gln Gly Tyr Leu Ile Lys Asp Gly Lys Leu Ile Lys Asn Asn Ala Ser
                                            300
Thr Asp Tyr Asp Leu Ser Asp Lys Ser Ile Asn Pro Leu Gly Gly Phe
305                 310                 315                 320
Val His Tyr Gly Glu Val Thr Asn Asp Phe Val Met Leu Lys Gly Cys
                                                        335
Val Val Gly Thr Lys Lys Arg Val Leu Thr Leu Arg Lys Ser Leu Leu
                                                    350
Val Gln Thr Lys Arg Arg Ala Leu Glu Lys Ile Asp Leu Lys Phe Ile
                                                365
Asp Thr Thr Ser Lys Phe Gly His Gly Arg Phe Gln Thr Met Glu Glu
                                            380
Lys Lys Ala Phe Met Gly Pro Leu Lys Lys Asp Arg Ile Ala Lys Glu
385                 390                 395                 400
Glu Gly Ala

(2) INFORMATION FOR SEQ ID NO:31:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 403 amino acids
                                            (B) TYPE: amino acid                                                          (C) STRANDEDNESS: Not R - #elevant                                            (D) TOPOLOGY: unknown                                               -     (ii) MOLECULE TYPE: protein                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:                                - Met Ser His Arg Lys Phe Ser Ala Pro Arg Hi - #s Gly Ser Leu Gly Phe         #                15                                                           - Leu Pro Arg Lys Arg Ser Ser Arg His Arg Gl - #y Lys Val Lys Ser Phe         #            30                                                               - Pro Lys Asp Asp Ser Ser Lys Pro Val His Le - #u Thr Ala Phe Leu Gly         #        45                                                                   - Tyr Lys Ala Gly Met Thr His Ile Val Arg Gl - #u Val Asp Arg Pro Gly         #    60                                                                       - Ser Lys Val Asn Lys Lys Glu Val Val Glu Al - #a Val Thr Ile Val Glu         #80                                                                           - Thr Pro Pro Met Val Ile Val Gly Ile Val Gl - #y Tyr Val Glu Thr Pro         #                95                                                           - Arg Gly Leu Arg Thr Phe Lys Thr Ile Phe Al - #a Glu His Ile Ser Asp         #           110                                                               - Glu Cys Lys Arg Arg Phe Tyr Lys Asn Trp Hi - #s Lys Ser Lys Lys Lys         #       125                                                                   - Ala Phe Thr Lys Tyr Cys Lys Lys Trp Gln As - #p Ala Asp Gly Lys Lys         #   140                                                                       - Gln Leu Glu Arg Asp Phe Ser Ser Met Lys Ly - #s Tyr Cys Gln Val Ile         145                 1 - #50                 1 - #55                 1 -       #60             
                                                              - Arg Val Ile Ala His Thr Gln Met Arg Leu Le - #u Pro Leu Arg Gln Lys         #               175                                                           - Lys Ala His Leu Met Glu Val Gln Val Asn Gl - #y Gly Thr Val Ala Glu         #           190                                                               - Lys Leu Asp Trp Ala Arg Glu Arg Leu Glu Gl - #n Gln Val Pro Val Asn         #       205                                                                   - Gln Val Phe Gly Gln Asp Glu Met Ile Asp Va - #l Ile Gly Val Thr Lys         #   220                                                                       - Gly Lys Gly Tyr Lys Gly Val Thr Ser Arg Tr - #p His Thr Lys Lys Leu         225                 2 - #30                 2 - #35                 2 -       #40                                                                           - Pro Arg Lys Thr His Arg Gly Leu Arg Lys Va - #l Ala Cys Ile Gly Ala         #               255                                                           - Trp His Pro Ala Arg Val Ala Phe Ser Val Al - #a Arg Ala Gly Gln Lys         #           270                                                               - Gly Tyr His His Arg Thr Glu Ile Asn Lys Ly - #s Ile Tyr Lys Ile Gly         #       285                                                                   - Gln Gly Tyr Leu Ile Lys Asp Gly Lys Leu Il - #e Lys Asn Asn Ala Ser         #   300                                                                       - Thr Asp Tyr Asp Leu Ser Asp Lys Ser Ile As - #n Pro Leu Gly Gly Phe         305                 3 - #10                 3 - #15                 3 -       #20                                                                           - Val His Tyr Gly Glu Val Thr Asn Asp Phe Va - #l Met Leu Lys Gly Cys         #               335                                                           - Val Val Gly Thr Lys Lys Arg Val Leu Thr Le - #u Arg Lys Ser Leu 
Leu         #           350                                                               - Val Gln Thr Lys Arg Arg Ala Leu Glu Lys Il - #e Asp Leu Lys Phe Ile         #       365                                                                   - Asp Thr Thr Ser Lys Phe Gly His Gly Arg Ph - #e Gln Thr Val Glu Glu         #   380                                                                       - Lys Lys Ala Phe Met Gly Pro Leu Lys Lys As - #p Arg Ile Ala Lys Glu         385                 3 - #90                 3 - #95                 4 -       #00                                                                           - Glu Gly Ala                                                                 - (2) INFORMATION FOR SEQ ID NO:32:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #acids    (A) LENGTH: 403 amino                                                         (B) TYPE: amino acid                                                          (C) STRANDEDNESS: Not R - #elevant                                            (D) TOPOLOGY: unknown                                               -     (ii) MOLECULE TYPE: protein                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:                                - Met Ser His Arg Lys Phe Ser Ala Pro Arg Hi - #s Gly Ser Leu Gly Phe         #                15                                                           - Leu Pro Arg Lys Arg Ser Ser Arg His Arg Gl - #y Lys Val Lys Ser Phe         #            30                                                               - Pro Lys Asp Asp Ala Ser Lys Pro Val His Le - #u Thr Ala Phe Leu Gly         #        45                                                                   - Tyr Lys Ala Gly Met Thr His Ile Val Arg Gl - #u Val Asp Arg Pro Gly         #    60                                                                       - Ser Lys Val Asn Lys Lys Glu Val Val 
Glu Al - #a Val Thr Ile Val Glu         #80                                                                           - Thr Pro Pro Met Val Val Val Gly Ile Val Gl - #y Tyr Val Glu Thr Pro         #                95                                                           - Arg Gly Leu Arg Thr Phe Lys Thr Val Phe Al - #a Glu His Ile Ser Asp         #           110                                                               - Glu Cys Lys Arg Arg Phe Tyr Lys Asn Trp Hi - #s Lys Ser Lys Lys Lys         #       125                                                                   - Ala Phe Thr Lys Tyr Cys Lys Lys Trp Gln As - #p Asp Thr Gly Lys Lys         #   140                                                                       - Gln Leu Glu Lys Asp Phe Asn Ser Met Lys Ly - #s Tyr Cys Gln Val Ile         145                 1 - #50                 1 - #55                 1 -       #60                                                                           - Arg Ile Ile Ala His Thr Gln Met Arg Leu Le - #u Pro Leu Arg Gln Lys         #               175                                                           - Lys Ala His Leu Met Glu Ile Gln Val Asn Gl - #y Gly Thr Val Ala Glu         #           190                                                               - Lys Leu Asp Trp Ala Arg Glu Arg Leu Glu Gl - #n Gln Val Pro Val Ser         #       205                                                                   - Gln Val Phe Gly Gln Asp Glu Met Ile Asp Va - #l Ile Gly Val Thr Lys         #   220                                                                       - Gly Lys Gly Tyr Lys Gly Val Thr Ser Arg Tr - #p His Thr Lys Lys Leu         225                 2 - #30                 2 - #35                 2 -       #40                                                                           - Pro Arg Lys Thr His Arg Gly Leu Arg Lys Va - #l Ala Cys Ile Gly Ala         #               255                                                           - Trp His 
Pro Ala Arg Val Ala Phe Thr Val Al - #a Arg Ala Gly Gln Lys         #           270                                                               - Gly Tyr His His Arg Thr Glu Ile Asn Lys Ly - #s Ile Tyr Lys Ile Gly         #       285                                                                   - Gln Gly Tyr Leu Ile Lys Asp Gly Lys Leu Il - #e Lys Asn Asn Ala Ser         #   300                                                                       - Thr Asp Tyr Asp Leu Ser Asp Lys Ser Ile As - #n Pro Leu Gly Gly Phe         305                 3 - #10                 3 - #15                 3 -       #20                                                                           - Val His Tyr Gly Glu Val Thr Asn Asp Phe Il - #e Met Leu Lys Gly Cys         #               335                                                           - Val Val Gly Thr Lys Lys Arg Val Leu Thr Le - #u Arg Lys Ser Leu Leu         #           350                                                               - Val Gln Thr Lys Arg Arg Ala Leu Glu Lys Il - #e Asp Leu Lys Phe Ile         #       365                                                                   - Asp Thr Thr Ser Lys Phe Gly His Gly Arg Ph - #e Gln Thr Met Glu Glu         #   380                                                                       - Lys Lys Ala Phe Met Gly Pro Leu Lys Lys As - #p Arg Ile Ala Lys Glu         385                 3 - #90                 3 - #95                 4 -       #00                                                                           - Glu Gly Ala                                                                 - (2) INFORMATION FOR SEQ ID NO:33:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 468 base                                                          (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                          
                            (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: cDNA                                                -     (ix) FEATURE:                                                                     (A) NAME/KEY: CDS                                                             (B) LOCATION: 1..357                                                -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:                                - CGG GAC ACC AAG TTT AGG GAG GAC TGC CCG CC - #G GAT CGC GAG GAA CTG           48                                                                          Arg Asp Thr Lys Phe Arg Glu Asp Cys Pro Pr - #o Asp Arg Glu Glu Leu           #                 15                                                          - GGC CGC CAC AGC TGG GCT GTC CTC CAC ACC CT - #G GCC GCC TAC TAC CCC           96                                                                          Gly Arg His Ser Trp Ala Val Leu His Thr Le - #u Ala Ala Tyr Tyr Pro           #             30                                                              - GAC CTG CCC ACC CCA GAA CAG CAG CAA GAC AT - #G GCC CAG TTC ATA CAT          144                                                                          Asp Leu Pro Thr Pro Glu Gln Gln Gln Asp Me - #t Ala Gln Phe Ile His           #         45                                                                  - TTA TTT TCT AAG TTT TAC CCC TGT GAG GAG TG - #T GCT GAA GAC CTA AGA          192                                                                          Leu Phe Ser Lys Phe Tyr Pro Cys Glu Glu Cy - #s Ala Glu Asp Leu Arg           #     60                                                                      - AAA AGG CTG TGC AGG AAC CAC CCA GAC ACC CG - #C ACC CGG GCA TGC TTC          240                                                                          Lys Arg Leu Cys Arg Asn His Pro Asp Thr Ar - #g Thr Arg Ala Cys Phe           # 80                            
                                              - ACA CAG TGG CTG TGC CAC CTG CAC AAT GAA GT - #G AAC CGC AAG CTG GGC          288                                                                          Thr Gln Trp Leu Cys His Leu His Asn Glu Va - #l Asn Arg Lys Leu Gly           #                 95                                                          - AAG CCT GAC TTC GAC TGC TCA AAA GTG GAT GA - #G CGC TGG CGC GAC GGC          336                                                                          Lys Pro Asp Phe Asp Cys Ser Lys Val Asp Gl - #u Arg Trp Arg Asp Gly           #           110                                                               - TGG AAG GAT GGC TCC TGT GAC TAGAGGGTGG TCAGCCAGA - #G CTCATGGGAC             387                                                                          Trp Lys Asp Gly Ser Cys Asp                                                           115                                                                   - AGCTAGCCAG GCATGGTTGG ATAGGGGCAG GGCACTCATT AAAGTGCATC AC - #AGCCAGAA        447                                                                          #                 468AA A                                                     - (2) INFORMATION FOR SEQ ID NO:34:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #acids    (A) LENGTH: 119 amino                                                         (B) TYPE: amino acid                                                          (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: protein                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:                                - Arg Asp Thr Lys Phe Arg Glu Asp Cys Pro Pr - #o Asp Arg Glu Glu Leu         #                 15                                                          - Gly Arg His Ser Trp Ala Val Leu His Thr Le - #u Ala Ala Tyr Tyr Pro         #   
          30                                                              - Asp Leu Pro Thr Pro Glu Gln Gln Gln Asp Me - #t Ala Gln Phe Ile His         #         45                                                                  - Leu Phe Ser Lys Phe Tyr Pro Cys Glu Glu Cy - #s Ala Glu Asp Leu Arg         #     60                                                                      - Lys Arg Leu Cys Arg Asn His Pro Asp Thr Ar - #g Thr Arg Ala Cys Phe         # 80                                                                          - Thr Gln Trp Leu Cys His Leu His Asn Glu Va - #l Asn Arg Lys Leu Gly         #                 95                                                          - Lys Pro Asp Phe Asp Cys Ser Lys Val Asp Gl - #u Arg Trp Arg Asp Gly         #           110                                                               - Trp Lys Asp Gly Ser Cys Asp                                                         115                                                                   - (2) INFORMATION FOR SEQ ID NO:35:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #acids    (A) LENGTH: 125 amino                                                         (B) TYPE: amino acid                                                          (C) STRANDEDNESS: Not R - #elevant                                            (D) TOPOLOGY: unknown                                               -     (ii) MOLECULE TYPE: protein                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:                                - Met Arg Thr Gln Gln Lys Arg Asp Ile Lys Ph - #e Arg Glu Asp Cys Pro         #                15                                                           - Gln Asp Arg Glu Glu Leu Gly Arg Asn Thr Tr - #p Ala Phe Leu His Thr         #            30                                                               - Leu Ala Ala Tyr Tyr Pro Asp Met Pro Thr Pr - #o Glu 
Gln Gln Gln Asp         #        45                                                                   - Met Ala Gln Phe Ile His Ile Phe Ser Lys Ph - #e Tyr Pro Cys Glu Glu         #    60                                                                       - Cys Ala Glu Asp Ile Arg Lys Arg Ile Asp Ar - #g Ser Gln Pro Asp Thr         #80                                                                           - Ser Thr Arg Val Ser Phe Ser Gln Trp Leu Cy - #s Arg Leu His Asn Glu         #                95                                                           - Val Asn Arg Lys Leu Gly Lys Pro Asp Phe As - #p Cys Ser Arg Val Asp         #           110                                                               - Glu Arg Trp Arg Asp Gly Trp Lys Asp Gly Se - #r Cys Asp                     #       125                                                                   - (2) INFORMATION FOR SEQ ID NO:36:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 20 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:                                # 20               CAGT                                                       - (2) INFORMATION FOR SEQ ID NO:37:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 20 base                                                           (B) TYPE: nucleic acid                                                        (C) 
STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:                                # 20               TCCT                                                       - (2) INFORMATION FOR SEQ ID NO:38:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 20 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:                                # 20               CTAC                                                       - (2) INFORMATION FOR SEQ ID NO:39:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 20 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:                                # 20               AACA                                         
              - (2) INFORMATION FOR SEQ ID NO:40:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 18 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:                                #  18              GT                                                         - (2) INFORMATION FOR SEQ ID NO:41:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 18 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:                                #  18              CA                                                         - (2) INFORMATION FOR SEQ ID NO:42:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 17 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear      
                                          -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:                                #   17             C                                                          - (2) INFORMATION FOR SEQ ID NO:43:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 17 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:                                #   17             T                                                          - (2) INFORMATION FOR SEQ ID NO:44:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 20 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:                                # 20               ACTG                                                       - (2) INFORMATION FOR SEQ ID NO:45:                                           -      
(i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 20 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:                                # 20               GATT                                                       - (2) INFORMATION FOR SEQ ID NO:46:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 20 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:                                # 20               TCTG                                                       - (2) INFORMATION FOR SEQ ID NO:47:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 20 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid             
                     #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:                                # 20               GACA                                                       - (2) INFORMATION FOR SEQ ID NO:48:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 17 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:                                #   17             C                                                          - (2) INFORMATION FOR SEQ ID NO:49:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 17 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:                                #   17             A                                                          - (2) INFORMATION FOR SEQ ID NO:50:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 20 
base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:                                # 20               TCTC                                                       - (2) INFORMATION FOR SEQ ID NO:51:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 20 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:                                # 20               ATAT                                                       - (2) INFORMATION FOR SEQ ID NO:52:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 19 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                           
   -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:                                # 19               GTG                                                        - (2) INFORMATION FOR SEQ ID NO:53:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 19 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:                                # 19               ATG                                                        - (2) INFORMATION FOR SEQ ID NO:54:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 18 base                                                           (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                -     (ii) MOLECULE TYPE: other nucleic acid                                  #= "oligonucleotide primer"/desc                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:                                #  18              CA                                                         - (2) INFORMATION FOR SEQ ID NO:55:                                           -      (i) SEQUENCE CHARACTERISTICS:                                          #pairs    (A) LENGTH: 17 base                                                           (B) TYPE: nucleic acid               
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:

C    17

(2) INFORMATION FOR SEQ ID NO:56:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 17 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:

A    17

(2) INFORMATION FOR SEQ ID NO:57:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 17 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:

C    17

(2) INFORMATION FOR SEQ ID NO:58:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:

GCAT    20

(2) INFORMATION FOR SEQ ID NO:59:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 17 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:

G    17

(2) INFORMATION FOR SEQ ID NO:60:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 17 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:

A    17

(2) INFORMATION FOR SEQ ID NO:61:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 17 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:

C    17

(2) INFORMATION FOR SEQ ID NO:62:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 17 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:

G    17

(2) INFORMATION FOR SEQ ID NO:63:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:

CTGC    20

(2) INFORMATION FOR SEQ ID NO:64:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:

GCAT    20

(2) INFORMATION FOR SEQ ID NO:65:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:

CCGC    20

(2) INFORMATION FOR SEQ ID NO:66:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:

CCCA    20

(2) INFORMATION FOR SEQ ID NO:67:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 18 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:

CT    18

(2) INFORMATION FOR SEQ ID NO:68:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 9 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "Kozak Initiation Sequence"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:

    9

(2) INFORMATION FOR SEQ ID NO:69:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:

TTTA    20

(2) INFORMATION FOR SEQ ID NO:70:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:

TGTG    20

(2) INFORMATION FOR SEQ ID NO:71:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 8 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:

His Arg Asp Leu Lys Pro Glu Asn
  1               5

(2) INFORMATION FOR SEQ ID NO:72:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 17 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:

T    17

(2) INFORMATION FOR SEQ ID NO:73:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "oligonucleotide primer"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:

GAGG    20

(2) INFORMATION FOR SEQ ID NO:74:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 6525 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 573..5684

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:

CACATAAAAT ACACCGCCCC GGCGCCCAGG CTCGGTGCTG GAGAGTCATG CCTGTGAGCC     60

CTGGGCACCT CCTGATGTCC TGCGAGGTCA CGGTGTTCCC AAACCTCAGG GTTGCCCTGC    120

CCCACTCCAG AGGCTCTCAG GCCCCACCCC GGAGCCCTCT GTGCGGAGCC GCCTCCTCCT    180

GGCCAGTTCC CCAGTAGTCC TGAAGGGAGA CCTGCTGTGT GGAGCCTCTT CTGGGACCCA    240

GCCATGAGTG TGGAGCTGAG CAACTGAACC TGAAACTCTT CCACTGTGAG TCAAGGAGGC    300

TTTTCCGCAC ATGAAGGACG CTGAGCGGGA AGGACTCCTC TCTGCCTGCA GTTGTAGCGA    360

GTGGACCAGC ACCAGGGGCT CTCTAGACTG CCCCTCCTCC ATCGCCTTCC CTGCCTCTCC    420

AGGACAGAGC AGCCACGTCT GCACACCTCG CCCTCTTTAC ACTCAGTTTT CAGAGCACGT    480

TTCTCCTATT TCCTGCGGGT TGCAGCGCCT ACTTGAACTT ACTCAGACCA CCTACTTCTC    540

TAGCAGCACT GGGCGTCCCT TTCAGCAAGA CG ATG GCT GTG CTC AGG CAG CTG      593
                                    Met Ala Val Leu Arg Gln Leu
                                      1               5

GCG CTC CTC CTC TGG AAG AAC TAC ACC CTG CAG AAG CGG AAG GTC CTG      641
Ala Leu Leu Leu Trp Lys Asn Tyr Thr Leu Gln Lys Arg Lys Val Leu
                                                 20

GTG ACG GTC CTG GAA CTC TTC CTG CCA TTG CTG TTT TCT GGG ATC CTC      689
Val Thr Val Leu Glu Leu Phe Leu Pro Leu Leu Phe Ser Gly Ile Leu
                                             35

ATC TGG CTC CGC TTG AAG ATT CAG TCG GAA AAT GTG CCC AAC GCC ACC      737
Ile Trp Leu Arg Leu Lys Ile Gln Ser Glu Asn Val Pro Asn Ala Thr
                                                             55

ATC TAC CCG GGC CAG TCC ATC CAG GAG CTG CCT CTG TTC TTC ACC TTC      785
Ile Tyr Pro Gly Gln Ser Ile Gln Glu Leu Pro Leu Phe Phe Thr Phe
                                                         70

CCT CCG CCA GGA GAC ACC TGG GAG CTT GCC TAC ATC CCT TCT CAC AGT      833
Pro Pro Pro Gly Asp Thr Trp Glu Leu Ala Tyr Ile Pro Ser His Ser
                                                     85

GAC GCT GCC AAG GCC GTC ACT GAG ACA GTG CGC AGG GCA CTT GTG ATC      881
Asp Ala Ala Lys Ala Val Thr Glu Thr Val Arg Arg Ala Leu Val Ile
                                                100

AAC ATG CGA GTG CGC GGC TTT CCC TCC GAG AAG GAC TTT GAG GAC TAC      929
Asn Met Arg Val Arg Gly Phe Pro Ser Glu Lys Asp Phe Glu Asp Tyr
                                            115

ATT AGG TAC GAC AAC TGC TCG TCC AGC GTG CTG GCC GCC GTG GTC TTC      977
Ile Arg Tyr Asp Asn Cys Ser Ser Ser Val Leu Ala Ala Val Val Phe
120                 125                 130                 135

GAG CAC CCC TTC AAC CAC AGC AAG GAG CCC CTG CCG CTG GCG GTG AAA     1025
Glu His Pro Phe Asn His Ser Lys Glu Pro Leu Pro Leu Ala Val Lys
                                                        150

TAT CAC CTA CGG TTC AGT TAC ACA CGG AGA AAT TAC ATG TGG ACC CAA     1073
Tyr His Leu Arg Phe Ser Tyr Thr Arg Arg Asn Tyr Met Trp Thr Gln
                                                    165

ACA GGC TCC TTT TTC CTG AAA GAG ACA GAA GGC TGG CAC ACT ACT TCC     1121
Thr Gly Ser Phe Phe Leu Lys Glu Thr Glu Gly Trp His Thr Thr Ser
                                                180

CTT TTC CCG CTT TTC CCA AAC CCA GGA CCA AGG GAA CTA ACA TCC CCT     1169
Leu Phe Pro Leu Phe Pro Asn Pro Gly Pro Arg Glu Leu Thr Ser Pro
                                            195

GAT GGC GGA GAA CCT GGG TAC ATC CGG GAA GGC TTC CTG GCC GTG CAG     1217
Asp Gly Gly Glu Pro Gly Tyr Ile Arg Glu Gly Phe Leu Ala Val Gln
200                 205                 210                 215

CAT GCT GTG GAC CGG GCC ATC ATG GAG TAC CAT GCC GAT GCC GCC ACA     1265
His Ala Val Asp Arg Ala Ile Met Glu Tyr His Ala Asp Ala Ala Thr
                                                        230

CGC CAG CTG TTC CAG AGA CTG ACG GTG ACC ATC AAG AGG TTC CCG TAC     1313
Arg Gln Leu Phe Gln Arg Leu Thr Val Thr Ile Lys Arg Phe Pro Tyr
                                                    245

CCG CCG TTC ATC GCA GAC CCC TTC CTC GTG GCC ATC CAG TAC CAG CTG     1361
Pro Pro Phe Ile Ala Asp Pro Phe Leu Val Ala Ile Gln Tyr Gln Leu
                                                260

CCC CTG CTG CTG CTG CTC AGC TTC ACC TAC ACC GCG CTC ACC ATT GCC     1409
Pro Leu Leu Leu Leu Leu Ser Phe Thr Tyr Thr Ala Leu Thr Ile Ala
                                            275

CGT GCT GTC GTG CAG GAG AAG GAA AGG AGG CTG AAG GAG TAC ATG CGC     1457
Arg Ala Val Val Gln Glu Lys Glu Arg Arg Leu Lys Glu Tyr Met Arg
280                 285                 290                 295

ATG ATG GGG CTC AGC AGC TGG CTG CAC TGG AGT GCC TGG TTC CTC TTG     1505
Met Met Gly Leu Ser Ser Trp Leu His Trp Ser Ala Trp Phe Leu Leu
                                                        310

TTC TTC CTC TTC CTC CTC ATC GCC GCC TCC TTC ATG ACC CTG CTC TTC     1553
Phe Phe Leu Phe Leu Leu Ile Ala Ala Ser Phe Met Thr Leu Leu Phe
                                                    325

TGT GTC AAG GTG AAG CCA AAT GTA GCC GTG CTG TCC CGC AGC GAC CCC     1601
Cys Val Lys Val Lys Pro Asn Val Ala Val Leu Ser Arg Ser Asp Pro
                                                340

TCC CTG GTG CTC GCC TTC CTG CTG TGC TTC GCC ATC TCT ACC ATC TCC     1649
Ser Leu Val Leu Ala Phe Leu Leu Cys Phe Ala Ile Ser Thr Ile Ser
                                            355

TTC AGC TTC ATG GTC AGC ACC TTC TTC AGC AAA GCC AAC ATG GCA GCA     1697
Phe Ser Phe Met Val Ser Thr Phe Phe Ser Lys Ala Asn Met Ala Ala
360                 365                 370                 375

GCC TTC GGA GGC TTC CTC TAC TTC TTC ACC TAC ATC CCC TAC TTC TTC     1745
Ala Phe Gly Gly Phe Leu Tyr Phe Phe Thr Tyr Ile Pro Tyr Phe Phe
                                                        390

GTG GCC CCT CGG TAC AAC TGG ATG ACT CTG AGC CAG AAG CTC TGC TCC     1793
Val Ala Pro Arg Tyr Asn Trp Met Thr Leu Ser Gln Lys Leu Cys Ser
                                                    405

TGC CTC CTG TCT AAT GTC GCC ATG GCA ATG GGA GCC CAG CTC ATT GGG     1841
Cys Leu Leu Ser Asn Val Ala Met Ala Met Gly Ala Gln Leu Ile Gly
                                                420

AAA TTT GAG GCG AAA GGC ATG GGC ATC CAG TGG CGA GAC CTC CTG AGT     1889
Lys Phe Glu Ala Lys Gly Met Gly Ile Gln Trp Arg Asp Leu Leu Ser
                                            435

CCC GTC AAC GTG GAC GAC GAC TTC TGC TTC GGG CAG GTG CTG GGG ATG     1937
Pro Val Asn Val Asp Asp Asp Phe Cys Phe Gly Gln Val Leu Gly Met
440                 445                 450                 455

CTG CTG CTG GAC TCT GTG CTC TAT GGC CTG GTG ACC TGG TAC ATG GAG     1985
Leu Leu Leu Asp Ser Val Leu Tyr Gly Leu Val Thr Trp Tyr Met Glu
                                                        470

GCC GTC TTC CCA GGG CAG TTC GGC GTG CCT CAG CCC TGG TAC TTC TTC     2033
Ala Val Phe Pro Gly Gln Phe Gly Val Pro Gln Pro Trp Tyr Phe Phe
                                                    485

ATC ATG CCC TCC TAT TGG TGT GGG AAG CCA AGG GCG GTT GCA GGG AAG     2081
Ile Met Pro Ser Tyr Trp Cys Gly Lys Pro Arg Ala Val Ala Gly Lys
                                                500

GAG GAA GAA GAC AGT GAC CCC GAG AAA GCA CTC AGA AAC GAG TAC TTT     2129
Glu Glu Glu Asp Ser Asp Pro Glu Lys Ala Leu Arg Asn Glu Tyr Phe
                                            515

GAA GCC GAG CCA GAG GAC CTG GTG GCG GGG ATC AAG ATC AAG CAC CTG     2177
Glu Ala Glu Pro Glu Asp Leu Val Ala Gly Ile Lys Ile Lys His Leu
520                 525                 530                 535

TCC AAG GTG TTC AGG GTG GGA AAT AAG GAC AGG GCG GCC GTC AGA GAC     2225
Ser Lys Val Phe Arg Val Gly Asn Lys Asp Arg Ala Ala Val Arg Asp
                                                        550

CTG AAC CTC AAC CTG TAC GAG GGA CAG ATC ACC GTC CTG CTG GGC CAC     2273
Leu Asn Leu Asn Leu Tyr Glu Gly Gln Ile Thr Val Leu Leu Gly His
                                                    565

AAC GGT GCC GGG AAG ACC ACC ACC CTC TCC ATG CTC ACA GGT CTC TTT     2321
Asn Gly Ala Gly Lys Thr Thr Thr Leu Ser Met Leu Thr Gly Leu Phe
                                                580

CCC CCC ACC AGT GGA CGG GCA TAC ATC AGC GGG TAT GAA ATT TCC CAG     2369
Pro Pro Thr Ser Gly Arg Ala Tyr Ile Ser Gly Tyr Glu Ile Ser Gln
                                            595

GAC ATG GTT CAG ATC CGG AAG AGC CTG GGC CTG TGC CCG CAG CAC GAC     2417
Asp Met Val Gln Ile Arg Lys Ser Leu Gly Leu Cys Pro Gln His Asp
600                 605                 610                 615

ATC CTG TTT GAC AAC TTG ACA GTC GCA GAG CAC CTT TAT TTC TAC GCC     2465
Ile Leu Phe Asp Asn Leu Thr Val Ala Glu His Leu Tyr Phe Tyr Ala
                                                        630

CAG CTG AAG GGC CTG TCA CGT CAG AAG TGC CCT GAA GAA GTC AAG CAG     2513
Gln Leu Lys Gly Leu Ser Arg Gln Lys Cys Pro Glu Glu Val Lys Gln
                                                    645

ATG CTG CAC ATC ATC GGC CTG GAG GAC AAG TGG AAC TCA CGG AGC CGC     2561
Met Leu His Ile Ile Gly Leu Glu Asp Lys Trp Asn Ser Arg Ser Arg
                                                660

TTC CTG AGC GGG GGC ATG AGG CGC AAG CTC TCC ATC GGC ATC GCC CTC     2609
Phe Leu Ser Gly Gly Met Arg Arg Lys Leu Ser Ile Gly Ile Ala Leu
                                            675

ATC GCA GGC TCC AAG GTG CTG ATA CTG GAC GAG CCC ACC TCG GGC ATG     2657
Ile Ala Gly Ser Lys Val Leu Ile Leu Asp Glu Pro Thr Ser Gly Met
680                 685                 690                 695

GAC GCC ATC TCC AGG AGG GCC ATC TGG GAT CTT CTT CAG CGG CAG AAA     2705
Asp Ala Ile Ser Arg Arg Ala Ile Trp Asp Leu Leu Gln Arg Gln Lys
                                                        710

AGT GAC CGC ACC ATC GTG CTG ACC ACC CAC TTC ATG GAC GAG GCT GAC     2753
Ser Asp Arg Thr Ile Val Leu Thr Thr His Phe Met Asp Glu Ala Asp
                                                    725

CTG CTG GGA GAC CGC ATC GCC ATC ATG GCC AAG GGG GAG CTG CAG TGC     2801
Leu Leu Gly Asp Arg Ile Ala Ile Met Ala Lys Gly Glu Leu Gln Cys
                                                740

TGC GGG TCC TCG CTG TTC CTC AAG CAG AAA TAC GGT GCC GGC TAT CAC     2849
Cys Gly Ser Ser Leu Phe Leu Lys Gln Lys Tyr Gly Ala Gly Tyr His
                                            755

ATG ACG CTG GTG AAG GAG CCG CAC TGC AAC CCG GAA GAC ATC TCC CAG     2897
Met Thr Leu Val Lys Glu Pro His Cys Asn Pro Glu Asp Ile Ser Gln
760                 765                 770                 775

CTG GTC CAC CAC CAC GTG CCC AAC GCC ACG CTG GAG AGC AGC GCT GGG     2945
Leu Val His His His Val Pro Asn Ala Thr Leu Glu Ser Ser Ala Gly
                                                        790

GCC GAG CTG TCT TTC ATC CTT CCC AGA GAG AGC ACG CAC AGG TTT GAA     2993
Ala Glu Leu Ser Phe Ile Leu Pro Arg Glu Ser Thr His Arg Phe Glu
                                                    805

GGT CTC TTT GCT AAA CTG GAG AAG AAG CAG AAA GAG CTG GGC ATT GCC     3041
Gly Leu Phe Ala Lys Leu Glu Lys Lys Gln Lys Glu Leu Gly Ile Ala
                                                820

AGC TTT GGG GCA TCC ATC ACC ACC ATG GAG GAA GTC TTC CTT CGG GTC     3089
Ser Phe Gly Ala Ser Ile Thr Thr Met Glu Glu Val Phe Leu Arg Val
                                            835

GGG AAG CTG GTG GAC AGC AGT ATG GAC ATC CAG GCC ATC CAG CTC CCT     3137
Gly Lys Leu Val Asp Ser Ser Met Asp Ile Gln Ala Ile Gln Leu Pro
840                 845                 850                 855

GCC CTG CAG TAC CAG CAC GAG AGG CGC GCC AGC GAC TGG GCT GTG GAC     3185
Ala Leu Gln Tyr Gln His Glu Arg Arg Ala Ser Asp Trp Ala Val Asp
                                                        870

AGC AAC CTC TGT GGG GCC ATG GAC CCC TCC GAC GGC ATT GGA GCC CTC     3233
Ser Asn Leu Cys Gly Ala Met Asp Pro Ser Asp Gly Ile Gly Ala Leu
                                                    885

ATC GAG GAG GAG CGC ACC GCT GTC AAG CTC AAC ACT GGG CTC GCC CTG     3281
Ile Glu Glu Glu Arg Thr Ala Val Lys Leu Asn Thr Gly Leu Ala Leu
                                                900

CAC TGC CAG CAA TTC TGG GCC ATG TTC CTG AAG AAG GCC GCA TAC AGC     3329
His Cys Gln Gln Phe Trp Ala Met Phe Leu Lys Lys Ala Ala Tyr Ser
                                            915

TGG CGC GAG TGG AAA ATG GTG GCG GCA CAG GTC CTG GTG CCT CTG ACC     3377
Trp Arg Glu Trp Lys Met Val Ala Ala Gln Val Leu Val Pro Leu Thr
920                 925                 930                 935
                                                                          - TGC GTC ACC CTG GCC CTC CTG GCC ATC AAC TA - #C TCC TCG GAG CTC TTC         3425                                                                          Cys Val Thr Leu Ala Leu Leu Ala Ile Asn Ty - #r Ser Ser Glu Leu Phe           #               950                                                           - GAC GAC CCC ATG CTG AGG CTG ACC TTG GGC GA - #G TAC GGC AGA ACC GTC         3473                                                                          Asp Asp Pro Met Leu Arg Leu Thr Leu Gly Gl - #u Tyr Gly Arg Thr Val           #           965                                                               - GTG CCC TTC TCA GTT CCC GGG ACC TCC CAG CT - #G GGT CAG CAG CTG TCA         3521                                                                          Val Pro Phe Ser Val Pro Gly Thr Ser Gln Le - #u Gly Gln Gln Leu Ser           #       980                                                                   - GAG CAT CTG AAA GAC GCA CTG CAG GCT GAG GG - #A CAG GAG CCC CGC GAG         3569                                                                          Glu His Leu Lys Asp Ala Leu Gln Ala Glu Gl - #y Gln Glu Pro Arg Glu           #   995                                                                       - GTG CTC GGT GAC CTG GAG GAG TTC TTG ATC TT - #C AGG GCT TCT GTG GAG         3617                                                                          Val Leu Gly Asp Leu Glu Glu Phe Leu Ile Ph - #e Arg Ala Ser Val Glu           #               10151005 - #                1010                              - GGG GGC GGC TTT AAT GAG CGG TGC CTT GTG GC - #A GCG TCC TTC AGA GAT         3665                                                                          Gly Gly Gly Phe Asn Glu Arg Cys Leu Val Al - #a Ala Ser Phe Arg Asp           #              10305                                                          - GTG GGA GAG CGC ACG GTC GTC AAC GCC TTG TT - #C AAC 
AAC CAG GCG TAC         3713                                                                          Val Gly Glu Arg Thr Val Val Asn Ala Leu Ph - #e Asn Asn Gln Ala Tyr           #          10450                                                              - CAC TCT CCA GCC ACT GCC CTG GCC GTC GTG GA - #C AAC CTT CTG TTC AAG         3761                                                                          His Ser Pro Ala Thr Ala Leu Ala Val Val As - #p Asn Leu Leu Phe Lys           #      10605                                                                  - CTG CTG TGC GGG CCT CAC GCC TCC ATT GTG GT - #C TCC AAC TTC CCC CAG         3809                                                                          Leu Leu Cys Gly Pro His Ala Ser Ile Val Va - #l Ser Asn Phe Pro Gln           #  10750                                                                      - CCC CGG AGC GCC CTG CAG GCT GCC AAG GAC CA - #G TTT AAC GAG GGC CGG         3857                                                                          Pro Arg Ser Ala Leu Gln Ala Ala Lys Asp Gl - #n Phe Asn Glu Gly Arg           #               10951085 - #                1090                              - AAG GGA TTC GAC ATT GCC CTC AAC CTG CTC TT - #C GCC ATG GCA TTC TTG         3905                                                                          Lys Gly Phe Asp Ile Ala Leu Asn Leu Leu Ph - #e Ala Met Ala Phe Leu           #              11105                                                          - GCC AGC ACG TTC TCC ATC CTG GCG GTC AGC GA - #G AGG GCC GTG CAG GCC         3953                                                                          Ala Ser Thr Phe Ser Ile Leu Ala Val Ser Gl - #u Arg Ala Val Gln Ala           #          11250                                                              - AAG CAT GTG CAG TTT GTG AGT GGA GTC CAC GT - #G GCC AGT TTC TGG CTC         4001                                                                          Lys His Val Gln Phe Val 
Ser Gly Val His Va - #l Ala Ser Phe Trp Leu           #      11405                                                                  - TCT GCT CTG CTG TGG GAC CTC ATC TCC TTC CT - #C ATC CCC AGT CTG CTG         4049                                                                          Ser Ala Leu Leu Trp Asp Leu Ile Ser Phe Le - #u Ile Pro Ser Leu Leu           #  11550                                                                      - CTG CTG GTG GTG TTT AAG GCC TTC GAC GTG CG - #T GCC TTC ACG CGG GAC         4097                                                                          Leu Leu Val Val Phe Lys Ala Phe Asp Val Ar - #g Ala Phe Thr Arg Asp           #               11751165 - #                1170                              - GGC CAC ATG GCT GAC ACC CTG CTG CTG CTC CT - #G CTC TAC GGC TGG GCC         4145                                                                          Gly His Met Ala Asp Thr Leu Leu Leu Leu Le - #u Leu Tyr Gly Trp Ala           #              11905                                                          - ATC ATC CCC CTC ATG TAC CTG ATG AAC TTC TT - #C TTC TTG GGG GCG GCC         4193                                                                          Ile Ile Pro Leu Met Tyr Leu Met Asn Phe Ph - #e Phe Leu Gly Ala Ala           #          12050                                                              - ACT GCC TAC ACG AGG CTG ACC ATC TTC AAC AT - #C CTG TCA GGC ATC GCC         4241                                                                          Thr Ala Tyr Thr Arg Leu Thr Ile Phe Asn Il - #e Leu Ser Gly Ile Ala           #      12205                                                                  - ACC TTC CTG ATG GTC ACC ATC ATG CGC ATC CC - #A GCT GTA AAA CTG GAA         4289                                                                          Thr Phe Leu Met Val Thr Ile Met Arg Ile Pr - #o Ala Val Lys Leu Glu           #  12350                                                                  
    - GAA CTT TCC AAA ACC CTG GAT CAC GTG TTC CT - #G GTG CTG CCC AAC CAC         4337                                                                          Glu Leu Ser Lys Thr Leu Asp His Val Phe Le - #u Val Leu Pro Asn His           #               12551245 - #                1250                              - TGT CTG GGG ATG GCA GTC AGC AGT TTC TAC GA - #G AAC TAC GAG ACG CGG         4385                                                                          Cys Leu Gly Met Ala Val Ser Ser Phe Tyr Gl - #u Asn Tyr Glu Thr Arg           #              12705                                                          - AGG TAC TGC ACC TCC TCC GAG GTC GCC GCC CA - #C TAC TGC AAG AAA TAT         4433                                                                          Arg Tyr Cys Thr Ser Ser Glu Val Ala Ala Hi - #s Tyr Cys Lys Lys Tyr           #          12850                                                              - AAC ATC CAG TAC CAG GAG AAC TTC TAT GCC TG - #G AGC GCC CCG GGG GTC         4481                                                                          Asn Ile Gln Tyr Gln Glu Asn Phe Tyr Ala Tr - #p Ser Ala Pro Gly Val           #      13005                                                                  - GGC CGG TTT GTG GCC TCC ATG GCC GCC TCA GG - #G TGC GCC TAC CTC ATC         4529                                                                          Gly Arg Phe Val Ala Ser Met Ala Ala Ser Gl - #y Cys Ala Tyr Leu Ile           #  13150                                                                      - CTG CTC TTC CTC ATC GAG ACC AAC CTG CTT CA - #G AGA CTC AGG GGC ATC         4577                                                                          Leu Leu Phe Leu Ile Glu Thr Asn Leu Leu Gl - #n Arg Leu Arg Gly Ile           #               13351325 - #                1330                              - CTC TGC GCC CTC CGG AGG AGG CGG ACA CTG AC - #A GAA TTA TAC ACC CGG         4625                                          
                                Leu Cys Ala Leu Arg Arg Arg Arg Thr Leu Th - #r Glu Leu Tyr Thr Arg           #              13505                                                          - ATG CCT GTG CTT CCT GAG GAC CAA GAT GTA GC - #G GAC GAG AGG ACC CGC         4673                                                                          Met Pro Val Leu Pro Glu Asp Gln Asp Val Al - #a Asp Glu Arg Thr Arg           #          13650                                                              - ATC CTG GCC CCC AGC CCG GAC TCC CTG CTC CA - #C ACA CCT CTG ATT ATC         4721                                                                          Ile Leu Ala Pro Ser Pro Asp Ser Leu Leu Hi - #s Thr Pro Leu Ile Ile           #      13805                                                                  - AAG GAG CTC TCC AAG GTG TAC GAG CAG CGG GT - #G CCC CTC CTG GCC GTG         4769                                                                          Lys Glu Leu Ser Lys Val Tyr Glu Gln Arg Va - #l Pro Leu Leu Ala Val           #  13950                                                                      - GAC AGG CTC TCC CTC GCG GTG CAG AAA GGG GA - #G TGC TTC GGC CTG CTG         4817                                                                          Asp Arg Leu Ser Leu Ala Val Gln Lys Gly Gl - #u Cys Phe Gly Leu Leu           #               14151405 - #                1410                              - GGC TTC AAT GGA GCC GGG AAG ACC ACG ACT TT - #C AAA ATG CTG ACC GGG         4865                                                                          Gly Phe Asn Gly Ala Gly Lys Thr Thr Thr Ph - #e Lys Met Leu Thr Gly           #              14305                                                          - GAG GAG AGC CTC ACT TCT GGG GAT GCC TTT GT - #C GGG GGT CAC AGA ATC         4913                                                                          Glu Glu Ser Leu Thr Ser Gly Asp Ala Phe Va - #l Gly Gly His Arg Ile           #          14450  
                                                            - AGC TCT GAT GTC GGA AAG GTG CGG CAG CGG AT - #C GGC TAC TGC CCG CAG         4961                                                                          Ser Ser Asp Val Gly Lys Val Arg Gln Arg Il - #e Gly Tyr Cys Pro Gln           #      14605                                                                  - TTT GAT GCC TTG CTG GAC CAC ATG ACA GGC CG - #G GAG ATG CTG GTC ATG         5009                                                                          Phe Asp Ala Leu Leu Asp His Met Thr Gly Ar - #g Glu Met Leu Val Met           #  14750                                                                      - TAC GCT CGG CTC CGG GGC ATC CCT GAG CGC CA - #C ATC GGG GCC TGC GTG         5057                                                                          Tyr Ala Arg Leu Arg Gly Ile Pro Glu Arg Hi - #s Ile Gly Ala Cys Val           #               14951485 - #                1490                              - GAG AAC ACT CTG CGG GGC CTG CTG CTG GAG CC - #A CAT GCC AAC AAG CTG         5105                                                                          Glu Asn Thr Leu Arg Gly Leu Leu Leu Glu Pr - #o His Ala Asn Lys Leu           #              15105                                                          - GTC AGG ACG TAC AGT GGT GGT AAC AAG CGG AA - #G CTG AGC ACC GGC ATC         5153                                                                          Val Arg Thr Tyr Ser Gly Gly Asn Lys Arg Ly - #s Leu Ser Thr Gly Ile           #          15250                                                              - GCC CTG ATC GGA GAG CCT GCT GTC ATC TTC CT - #G GAC GAG CCG TCC ACT         5201                                                                          Ala Leu Ile Gly Glu Pro Ala Val Ile Phe Le - #u Asp Glu Pro Ser Thr           #      15405                                                                  - GGC ATG GAC CCC GTG GCC CGG CGC CTG CTT TG - #G GAC ACC GTG GCA 
CGA         5249                                                                          Gly Met Asp Pro Val Ala Arg Arg Leu Leu Tr - #p Asp Thr Val Ala Arg           #  15550                                                                      - GCC CGA GAG TCT GGC AAG GCC ATC ATC ATC AC - #C TCC CAC AGC ATG GAG         5297                                                                          Ala Arg Glu Ser Gly Lys Ala Ile Ile Ile Th - #r Ser His Ser Met Glu           #               15751565 - #                1570                              - GAG TGT GAG GCC CTG TGC ACC CGG CTG GCC AT - #C ATG GTG CAG GGG CAG         5345                                                                          Glu Cys Glu Ala Leu Cys Thr Arg Leu Ala Il - #e Met Val Gln Gly Gln           #              15905                                                          - TTC AAG TGC CTG GGC AGC CCC CAG CAC CTC AA - #G AGC AAG TTC GGC AGC         5393                                                                          Phe Lys Cys Leu Gly Ser Pro Gln His Leu Ly - #s Ser Lys Phe Gly Ser           #          16050                                                              - GGC TAC TCC CTG CGG GCC AAG GTG CAG AGT GA - #A GGG CAA CAG GAG GCG         5441                                                                          Gly Tyr Ser Leu Arg Ala Lys Val Gln Ser Gl - #u Gly Gln Gln Glu Ala           #      16205                                                                  - CTG GAG GAG TTC AAG GCC TTC GTG GAC CTG AC - #C TTT CCA GGC AGC GTC         5489                                                                          Leu Glu Glu Phe Lys Ala Phe Val Asp Leu Th - #r Phe Pro Gly Ser Val           #  16350                                                                      - CTG GAA GAT GAG CAC CAA GGC ATG GTC CAT TA - #C CAC CTG CCG GGC CGT         5537                                                                          Leu Glu Asp Glu His Gln Gly Met Val 
His Ty - #r His Leu Pro Gly Arg           #               16551645 - #                1650                              - GAC CTC AGC TGG GCG AAG GTT TTC GGT ATT CT - #G GAG AAA GCC AAG GAA         5585                                                                          Asp Leu Ser Trp Ala Lys Val Phe Gly Ile Le - #u Glu Lys Ala Lys Glu           #              16705                                                          - AAG TAC GGC GTG GAC GAC TAC TCC GTG AGC CA - #G ATC TCG CTG GAA CAG         5633                                                                          Lys Tyr Gly Val Asp Asp Tyr Ser Val Ser Gl - #n Ile Ser Leu Glu Gln           #          16850                                                              - GTC TTC CTG AGC TTC GCC CAC CTG CAG CCG CC - #C ACC GCA GAG GAG GGG         5681                                                                          Val Phe Leu Ser Phe Ala His Leu Gln Pro Pr - #o Thr Ala Glu Glu Gly           #      17005                                                                  - CGA TGAGGGGTGG CGGCTGTCTC GCCATCAGGC AGGGACAGGA CGGGCAAGC - #A              5734                                                                          Arg                                                                           - GGGCCCATCT TACATCCTCT CTCTCCAAGT TTATCTCATC CTTTATTTTT AA - #TCACTTTT       5794                                                                          - TTCTATGATG GATATGAAAA ATTCAAGGCA GTATGCACAG AATGGACGAG TG - #CAGCCCAG       5854                                                                          - CCCTCATGCC CAGGATCAGC ATGCGCATCT CCATGTCTGC ATACTCTGGA GT - #TCACTTTC       5914                                                                          - CCAGAGCTGG GGCAGGCCGG GCAGTCTGCG GGCAAGCTCC GGGGTCTCTG GG - #TGGAGAGC       5974                                                                          - TGACCCAGGA AGGGCTGCAG CTGAGCTGGG GGTTGAATTT CTCCAGGCAC TC - #CCTGGAGA       6034    
GAGGACCCAG TGACTTGTCC AAGTTTACAC ACGACACTAA TCTCCCCTGG GGAGGAAGCG       6094
GGAAGCCAGC CAGGTTGAAC TGTAGCGAGG CCCCCAGGCC GCCAGGAATG GACCATGCAG       6154
ATCACTGTCA GTGGAGGGAA GCTGCTGACT GTGATTAGGT GCTGGGGTCT TAGCGTCCAG       6214
CGCAGCCCGG GGGCATCCTG GAGGCTCTGC TCCTTAGGGC ATGGTAGTCA CCGCGAAGCC       6274
GGGCACCGTC CCACAGCATC TCCTAGAAGC AGCCGGCACA GGAGGGAAGG TGGCCAGGCT       6334
CGAAGCAGTC TCTGTTTCCA GCACTGCACC CTCAGGAAGT CGCCCGCCCC AGGACACGCA       6394
GGGACCACCC TAAGGGCTGG GTGGCTGTCT CAAGGACACA TTGAATACGT TGTGACCATC       6454
CAGAAAATAA ATGCTGAGGG GACACAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA       6514
#     6525

(2) INFORMATION FOR SEQ ID NO:75:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1704 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:
- Met Ala Val Leu Arg Gln Leu Ala Leu Leu Le - #u Trp Lys 
Asn Tyr Thr         #                 15                                                          - Leu Gln Lys Arg Lys Val Leu Val Thr Val Le - #u Glu Leu Phe Leu Pro         #             30                                                              - Leu Leu Phe Ser Gly Ile Leu Ile Trp Leu Ar - #g Leu Lys Ile Gln Ser         #         45                                                                  - Glu Asn Val Pro Asn Ala Thr Ile Tyr Pro Gl - #y Gln Ser Ile Gln Glu         #     60                                                                      - Leu Pro Leu Phe Phe Thr Phe Pro Pro Pro Gl - #y Asp Thr Trp Glu Leu         # 80                                                                          - Ala Tyr Ile Pro Ser His Ser Asp Ala Ala Ly - #s Ala Val Thr Glu Thr         #                 95                                                          - Val Arg Arg Ala Leu Val Ile Asn Met Arg Va - #l Arg Gly Phe Pro Ser         #           110                                                               - Glu Lys Asp Phe Glu Asp Tyr Ile Arg Tyr As - #p Asn Cys Ser Ser Ser         #       125                                                                   - Val Leu Ala Ala Val Val Phe Glu His Pro Ph - #e Asn His Ser Lys Glu         #   140                                                                       - Pro Leu Pro Leu Ala Val Lys Tyr His Leu Ar - #g Phe Ser Tyr Thr Arg         145                 1 - #50                 1 - #55                 1 -       #60                                                                           - Arg Asn Tyr Met Trp Thr Gln Thr Gly Ser Ph - #e Phe Leu Lys Glu Thr         #               175                                                           - Glu Gly Trp His Thr Thr Ser Leu Phe Pro Le - #u Phe Pro Asn Pro Gly         #           190                                                               - Pro Arg Glu Leu Thr Ser Pro Asp Gly Gly Gl - #u Pro Gly Tyr Ile Arg         #       205                   
                                                - Glu Gly Phe Leu Ala Val Gln His Ala Val As - #p Arg Ala Ile Met Glu         #   220                                                                       - Tyr His Ala Asp Ala Ala Thr Arg Gln Leu Ph - #e Gln Arg Leu Thr Val         225                 2 - #30                 2 - #35                 2 -       #40                                                                           - Thr Ile Lys Arg Phe Pro Tyr Pro Pro Phe Il - #e Ala Asp Pro Phe Leu         #               255                                                           - Val Ala Ile Gln Tyr Gln Leu Pro Leu Leu Le - #u Leu Leu Ser Phe Thr         #           270                                                               - Tyr Thr Ala Leu Thr Ile Ala Arg Ala Val Va - #l Gln Glu Lys Glu Arg         #       285                                                                   - Arg Leu Lys Glu Tyr Met Arg Met Met Gly Le - #u Ser Ser Trp Leu His         #   300                                                                       - Trp Ser Ala Trp Phe Leu Leu Phe Phe Leu Ph - #e Leu Leu Ile Ala Ala         305                 3 - #10                 3 - #15                 3 -       #20                                                                           - Ser Phe Met Thr Leu Leu Phe Cys Val Lys Va - #l Lys Pro Asn Val Ala         #               335                                                           - Val Leu Ser Arg Ser Asp Pro Ser Leu Val Le - #u Ala Phe Leu Leu Cys         #           350                                                               - Phe Ala Ile Ser Thr Ile Ser Phe Ser Phe Me - #t Val Ser Thr Phe Phe         #       365                                                                   - Ser Lys Ala Asn Met Ala Ala Ala Phe Gly Gl - #y Phe Leu Tyr Phe Phe         #   380                                                                       - Thr Tyr Ile Pro Tyr Phe Phe Val Ala Pro Ar - #g Tyr Asn Trp Met Thr         
385                 3 - #90                 3 - #95                 4 -       #00                                                                           - Leu Ser Gln Lys Leu Cys Ser Cys Leu Leu Se - #r Asn Val Ala Met Ala         #               415                                                           - Met Gly Ala Gln Leu Ile Gly Lys Phe Glu Al - #a Lys Gly Met Gly Ile         #           430                                                               - Gln Trp Arg Asp Leu Leu Ser Pro Val Asn Va - #l Asp Asp Asp Phe Cys         #       445                                                                   - Phe Gly Gln Val Leu Gly Met Leu Leu Leu As - #p Ser Val Leu Tyr Gly         #   460                                                                       - Leu Val Thr Trp Tyr Met Glu Ala Val Phe Pr - #o Gly Gln Phe Gly Val         465                 4 - #70                 4 - #75                 4 -       #80                                                                           - Pro Gln Pro Trp Tyr Phe Phe Ile Met Pro Se - #r Tyr Trp Cys Gly Lys         #               495                                                           - Pro Arg Ala Val Ala Gly Lys Glu Glu Glu As - #p Ser Asp Pro Glu Lys         #           510                                                               - Ala Leu Arg Asn Glu Tyr Phe Glu Ala Glu Pr - #o Glu Asp Leu Val Ala         #       525                                                                   - Gly Ile Lys Ile Lys His Leu Ser Lys Val Ph - #e Arg Val Gly Asn Lys         #   540                                                                       - Asp Arg Ala Ala Val Arg Asp Leu Asn Leu As - #n Leu Tyr Glu Gly Gln         545                 5 - #50                 5 - #55                 5 -       #60                                                                           - Ile Thr Val Leu Leu Gly His Asn Gly Ala Gl - #y Lys Thr Thr Thr Leu         #               575                               
                            - Ser Met Leu Thr Gly Leu Phe Pro Pro Thr Se - #r Gly Arg Ala Tyr Ile         #           590                                                               - Ser Gly Tyr Glu Ile Ser Gln Asp Met Val Gl - #n Ile Arg Lys Ser Leu         #       605                                                                   - Gly Leu Cys Pro Gln His Asp Ile Leu Phe As - #p Asn Leu Thr Val Ala         #   620                                                                       - Glu His Leu Tyr Phe Tyr Ala Gln Leu Lys Gl - #y Leu Ser Arg Gln Lys         625                 6 - #30                 6 - #35                 6 -       #40                                                                           - Cys Pro Glu Glu Val Lys Gln Met Leu His Il - #e Ile Gly Leu Glu Asp         #               655                                                           - Lys Trp Asn Ser Arg Ser Arg Phe Leu Ser Gl - #y Gly Met Arg Arg Lys         #           670                                                               - Leu Ser Ile Gly Ile Ala Leu Ile Ala Gly Se - #r Lys Val Leu Ile Leu         #       685                                                                   - Asp Glu Pro Thr Ser Gly Met Asp Ala Ile Se - #r Arg Arg Ala Ile Trp         #   700                                                                       - Asp Leu Leu Gln Arg Gln Lys Ser Asp Arg Th - #r Ile Val Leu Thr Thr         705                 7 - #10                 7 - #15                 7 -       #20                                                                           - His Phe Met Asp Glu Ala Asp Leu Leu Gly As - #p Arg Ile Ala Ile Met         #               735                                                           - Ala Lys Gly Glu Leu Gln Cys Cys Gly Ser Se - #r Leu Phe Leu Lys Gln         #           750                                                               - Lys Tyr Gly Ala Gly Tyr His Met Thr Leu Va - #l Lys Glu Pro His Cys         #       765           
                                                        - Asn Pro Glu Asp Ile Ser Gln Leu Val His Hi - #s His Val Pro Asn Ala         #   780                                                                       - Thr Leu Glu Ser Ser Ala Gly Ala Glu Leu Se - #r Phe Ile Leu Pro Arg         785                 7 - #90                 7 - #95                 8 -       #00                                                                           - Glu Ser Thr His Arg Phe Glu Gly Leu Phe Al - #a Lys Leu Glu Lys Lys         #               815                                                           - Gln Lys Glu Leu Gly Ile Ala Ser Phe Gly Al - #a Ser Ile Thr Thr Met         #           830                                                               - Glu Glu Val Phe Leu Arg Val Gly Lys Leu Va - #l Asp Ser Ser Met Asp         #       845                                                                   - Ile Gln Ala Ile Gln Leu Pro Ala Leu Gln Ty - #r Gln His Glu Arg Arg         #   860                                                                       - Ala Ser Asp Trp Ala Val Asp Ser Asn Leu Cy - #s Gly Ala Met Asp Pro         865                 8 - #70                 8 - #75                 8 -       #80                                                                           - Ser Asp Gly Ile Gly Ala Leu Ile Glu Glu Gl - #u Arg Thr Ala Val Lys         #               895                                                           - Leu Asn Thr Gly Leu Ala Leu His Cys Gln Gl - #n Phe Trp Ala Met Phe         #           910                                                               - Leu Lys Lys Ala Ala Tyr Ser Trp Arg Glu Tr - #p Lys Met Val Ala Ala         #       925                                                                   - Gln Val Leu Val Pro Leu Thr Cys Val Thr Le - #u Ala Leu Leu Ala Ile         #   940                                                                       - Asn Tyr Ser Ser Glu Leu Phe Asp Asp Pro Me - #t Leu Arg Leu Thr Leu   
      945                 9 - #50                 9 - #55                 9 -       #60                                                                           - Gly Glu Tyr Gly Arg Thr Val Val Pro Phe Se - #r Val Pro Gly Thr Ser         #               975                                                           - Gln Leu Gly Gln Gln Leu Ser Glu His Leu Ly - #s Asp Ala Leu Gln Ala         #           990                                                               - Glu Gly Gln Glu Pro Arg Glu Val Leu Gly As - #p Leu Glu Glu Phe Leu         #      10050                                                                  - Ile Phe Arg Ala Ser Val Glu Gly Gly Gly Ph - #e Asn Glu Arg Cys Leu         #  10205                                                                      - Val Ala Ala Ser Phe Arg Asp Val Gly Glu Ar - #g Thr Val Val Asn Ala         #               10401030 - #                1035                              - Leu Phe Asn Asn Gln Ala Tyr His Ser Pro Al - #a Thr Ala Leu Ala Val         #              10550                                                          - Val Asp Asn Leu Leu Phe Lys Leu Leu Cys Gl - #y Pro His Ala Ser Ile         #          10705                                                              - Val Val Ser Asn Phe Pro Gln Pro Arg Ser Al - #a Leu Gln Ala Ala Lys         #      10850                                                                  - Asp Gln Phe Asn Glu Gly Arg Lys Gly Phe As - #p Ile Ala Leu Asn Leu         #  11005                                                                      - Leu Phe Ala Met Ala Phe Leu Ala Ser Thr Ph - #e Ser Ile Leu Ala Val         #               11201110 - #                1115                              - Ser Glu Arg Ala Val Gln Ala Lys His Val Gl - #n Phe Val Ser Gly Val         #              11350                                                          - His Val Ala Ser Phe Trp Leu Ser Ala Leu Le - #u Trp Asp Leu Ile Ser         #          11505                            
    Phe Leu Ile Pro Ser Leu Leu Leu Leu Val Val Phe Lys Ala Phe Asp
           1155                1160                1165
    Val Arg Ala Phe Thr Arg Asp Gly His Met Ala Asp Thr Leu Leu Leu
       1170                1175                1180
    Leu Leu Leu Tyr Gly Trp Ala Ile Ile Pro Leu Met Tyr Leu Met Asn
   1185                1190                1195                1200
    Phe Phe Phe Leu Gly Ala Ala Thr Ala Tyr Thr Arg Leu Thr Ile Phe
                   1205                1210                1215
    Asn Ile Leu Ser Gly Ile Ala Thr Phe Leu Met Val Thr Ile Met Arg
               1220                1225                1230
    Ile Pro Ala Val Lys Leu Glu Glu Leu Ser Lys Thr Leu Asp His Val
           1235                1240                1245
    Phe Leu Val Leu Pro Asn His Cys Leu Gly Met Ala Val Ser Ser Phe
       1250                1255                1260
    Tyr Glu Asn Tyr Glu Thr Arg Arg Tyr Cys Thr Ser Ser Glu Val Ala
   1265                1270                1275                1280
    Ala His Tyr Cys Lys Lys Tyr Asn Ile Gln Tyr Gln Glu Asn Phe Tyr
                   1285                1290                1295
    Ala Trp Ser Ala Pro Gly Val Gly Arg Phe Val Ala Ser Met Ala Ala
               1300                1305                1310
    Ser Gly Cys Ala Tyr Leu Ile Leu Leu Phe Leu Ile Glu Thr Asn Leu
           1315                1320                1325
    Leu Gln Arg Leu Arg Gly Ile Leu Cys Ala Leu Arg Arg Arg Arg Thr
       1330                1335                1340
    Leu Thr Glu Leu Tyr Thr Arg Met Pro Val Leu Pro Glu Asp Gln Asp
   1345                1350                1355                1360
    Val Ala Asp Glu Arg Thr Arg Ile Leu Ala Pro Ser Pro Asp Ser Leu
                   1365                1370                1375
    Leu His Thr Pro Leu Ile Ile Lys Glu Leu Ser Lys Val Tyr Glu Gln
               1380                1385                1390
    Arg Val Pro Leu Leu Ala Val Asp Arg Leu Ser Leu Ala Val Gln Lys
           1395                1400                1405
    Gly Glu Cys Phe Gly Leu Leu Gly Phe Asn Gly Ala Gly Lys Thr Thr
       1410                1415                1420
    Thr Phe Lys Met Leu Thr Gly Glu Glu Ser Leu Thr Ser Gly Asp Ala
   1425                1430                1435                1440
    Phe Val Gly Gly His Arg Ile Ser Ser Asp Val Gly Lys Val Arg Gln
                   1445                1450                1455
    Arg Ile Gly Tyr Cys Pro Gln Phe Asp Ala Leu Leu Asp His Met Thr
               1460                1465                1470
    Gly Arg Glu Met Leu Val Met Tyr Ala Arg Leu Arg Gly Ile Pro Glu
           1475                1480                1485
    Arg His Ile Gly Ala Cys Val Glu Asn Thr Leu Arg Gly Leu Leu Leu
       1490                1495                1500
    Glu Pro His Ala Asn Lys Leu Val Arg Thr Tyr Ser Gly Gly Asn Lys
   1505                1510                1515                1520
    Arg Lys Leu Ser Thr Gly Ile Ala Leu Ile Gly Glu Pro Ala Val Ile
                   1525                1530                1535
    Phe Leu Asp Glu Pro Ser Thr Gly Met Asp Pro Val Ala Arg Arg Leu
               1540                1545                1550
    Leu Trp Asp Thr Val Ala Arg Ala Arg Glu Ser Gly Lys Ala Ile Ile
           1555                1560                1565
    Ile Thr Ser His Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Arg Leu
       1570                1575                1580
    Ala Ile Met Val Gln Gly Gln Phe Lys Cys Leu Gly Ser Pro Gln His
   1585                1590                1595                1600
    Leu Lys Ser Lys Phe Gly Ser Gly Tyr Ser Leu Arg Ala Lys Val Gln
                   1605                1610                1615
    Ser Glu Gly Gln Gln Glu Ala Leu Glu Glu Phe Lys Ala Phe Val Asp
               1620                1625                1630
    Leu Thr Phe Pro Gly Ser Val Leu Glu Asp Glu His Gln Gly Met Val
           1635                1640                1645
    His Tyr His Leu Pro Gly Arg Asp Leu Ser Trp Ala Lys Val Phe Gly
       1650                1655                1660
    Ile Leu Glu Lys Ala Lys Glu Lys Tyr Gly Val Asp Asp Tyr Ser Val
   1665                1670                1675                1680
    Ser Gln Ile Ser Leu Glu Gln Val Phe Leu Ser Phe Ala His Leu Gln
                   1685                1690                1695
    Pro Pro Thr Ala Glu Glu Gly Arg
               1700

(2) INFORMATION FOR SEQ ID NO:76:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 18 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "Oligonucleotide primer"
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:
                    CT    18

(2) INFORMATION FOR SEQ ID NO:77:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 349 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown
    (ii) MOLECULE TYPE: protein
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:
    Gly Gln Leu Leu Gly His Asn Gly Ala Gly Lys Thr Thr Ser Ile Gly
                      5                  10                  15
    Arg Pro Thr Gly Ile Gly Tyr Asp Arg Gly Cys Pro Gln Leu Asp Leu
                 20                  25                  30
    Thr Val Glu His Leu Leu Lys Gly Lys Leu Leu Lys Asn Leu Ser Gly
             35                  40                  45
    Gly Met Arg Lys Leu Gly Leu Asp Glu Pro Thr Ala Gly Met Asp Arg
         50                  55                  60
    Leu Arg Lys Arg Thr Ile Leu Thr Thr His Met Asp Glu Ala Leu Gly
     65                  70                  75                  80
    Asp Ile Met His Gly Leu Gly Leu Lys Gln Lys Gly Gly Tyr Thr Val
                     85                  90                  95
    Glu Gln Pro Ala Arg Phe Leu Leu Ser Phe Gly Ser Thr Glu Val Phe
                100                 105                 110
    Ile Gly Asp His Arg Gly Ala Gln Phe Lys Lys Tyr Ser Arg Trp Gln
            115                 120                 125
    Val Leu Pro Leu Asp Leu Thr Glu Val Phe Pro Leu Pro Gly Ala Leu
        130                 135                 140
    Phe Asn Tyr His Thr Ser Val Ser Gln Ala Leu Ala Ser Thr Phe Glu
    145                 150                 155                 160
    Arg Gln Ala His Gln Phe Gly Phe Leu Asp Ile Ser Leu Leu Phe Asp
                    165                 170                 175
    His Ala Leu Leu Tyr Ser Pro Tyr Phe Phe Ala Leu Ile Ala Leu Val
                180                 185                 190
    Glu Leu Leu Phe Leu Pro Gly Ala Asn Trp Gly Phe Leu Arg Met Leu
            195                 200                 205
    Pro Val Glu Arg Arg Asn Leu Ile Lys Leu Lys Ala Val Leu Leu Ala
        210                 215                 220
    Val Glu Cys Phe Gly Leu Leu Gly Asn Gly Ala Gly Lys Thr Thr Thr
    225                 230                 235                 240
    Phe Leu Thr Gly Ser Ser Gly Ala Gly Gly Asp Val Ile Gly Tyr Cys
                    245                 250                 255
    Pro Gln Phe Asp Ala Leu Thr Gly Arg Glu Leu Ala Gly Ala Glu Leu
                260                 265                 270
    His Ala Lys Leu Val Arg Tyr Ser Gly Gly Lys Arg Lys Ser Gly Ala
            275                 280                 285
    Leu Leu Pro Gln Ile Leu Asp Glu Pro Gly Asp Pro Ala Arg Arg Trp
        290                 295                 300
    Glu Ser Ala Thr Ser His Ser Met Glu Cys Glu Ala Leu Cys Arg Ala
    305                 310                 315                 320
    Gly Gly Ser Gln Leu Lys Ser Gly Tyr Val Pro Ser Val Leu Leu Pro
                    325                 330                 335
    Trp Phe Gly Val Asp Gln Ser Leu Glu Phe Leu Ala Leu
                340                 345

(2) INFORMATION FOR SEQ ID NO:78:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1974 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: cDNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:
    CAGCGGGAGG ACGCGCCAAC ATCCCCGCTG CTGTGCTGGG CCCGGGGCGT GCCCGCCGCT    60
    GCTCCCACCT CTGGGCCGGG CTGGGGCCGC CCGGGGGCCC TGTTCCTCGG CATTGCGGGC   120
    CTGGTGGGCA GAACCGCGGA GAGGGCTTCT TTTCCCCAAG GGCAGCGTCT TGGGGCCCGG   180
    CCACTGGCTG ACCCGCAGCG GCTCCGGCCA TGCCTGGCTG GCCCTGGGGG CTGCTGCTGA   240
    CGGCAGGCAC GCTCTTCGCC GCCCTGAGTC CTGGGCCGCC GGCGCCCGCC GACCCCTGCC   300
    ACGATGAGGG GGGTGCGCCC CGCGGCTGCG TGCCAGGACT GGTGAACGCC GCCCTGGGCC   360
    GCGAGGTGCT GGCTTCCAGC ACGTGCGGGC GGCCGGCCAC TCGGGCCTGC GACGCCTCCG   420
    ACCCGCGACG GGCACACTCC CCCGCCCTCC TTACTTCCCC AGGGGGCACG GCCAGCCCTC   480
    TGTGCTGGCG CTCGGAGTCC CTGCCTCGGG CGCCCCTCAA CGTGACTCTC ACGGTGCCCC   540
    TGGGCAAGGC TTTTGAGCTG GTCTTCGTGA GCCTGCGCTT CTGCTCAGCT CCCCCAGCCT   600
    CCGTGGCCCT GCTCAAGTCT CAGGACCATG GCCGCAGCTG GGCCCCGCTG GGCTTCTTCT   660
    CCTCCCACTG TGACCTGGAC TATGGCCGTC TGCCTGCCCC TGCCAATGGC CCAGCTGGCC   720
    CAGGGCCTGA GGCCCTGTGC TTCCCCGCAC CCCTGGCCCA GCCTGATGGC AGCGGCCTTC   780
    TGGCCTTCAG CATGCAGGAC AGCAGCCCCC CAGGCCTGGA CCTGGACAGC AGCCCAGTGC   840
    TCCAAGACTG GGTGACCGCC ACCGACGTCC GTGTAGTGCT CACAAGGCCT AGCACGGCAG   900
    GTGACCCCAG GGACATGGAG GCCGTCGTCC CTTACTCCTA CGCAGCCACC GACCTCCAGG   960
    TGGGCGGGCG CTGCAAGTGC AATGGACATG CCTCACGGTG CCTGCTGGAC ACACAGGGCC  1020
    ACCTGATCTG CGACTGTCGG CATGGCACCG AGGGCCCTGA CTGCGGCCGC TGCAAGCCCT  1080
    TCTACTGCGA CAGGCCATGG CAGCGGGCCA CTGCCCGGGA ATCCCACGCC TGCCTCGCTT  1140
    GCTCCTGCAA CGGCCATGCC CGCCGCTGCC GCTTCAACAT GGAGCTGTAC CGACTGTCCG  1200
    GCCGCCGCAG CGGGGGTGTC TGTCTCAACT GCCGGCACAA CACCGCCGGC CGCCACTGCC  1260
    ACTACTGCCG GGAGGGCTTC TATCGAGACC CTGGCCGTGC CCTGAGTGAC CGTCGGGCTT  1320
    GCAGGGCCTG CGACTGTCAC CCGGTTGGTG CTGCTGGCAA GACCTGCAAC CAGACCACAG  1380
    GCCAGTGTCC CTGCAAGGAT GGCGTCACTG GCCTCACCTG CAACCGCTGC GCGCCTGGCT  1440
    TCCAGCAAAG CCGCTCCCCA GTGGCGCCCT GTGTTAAGAC CCCTATCCCT GGACCCACTG  1500
    AGGACAGCAG CCCTGTGCAG CCCCAGGACT GTGACTCGCA CTGCAAACCT GCCCGTGGCA  1560
    GCTACCGCAT CAGCCTAAAG AAGTTCTGCA AGAAGGACTA TGCGGTGCAG GTGGCGGTGG  1620
    GTGCGCGCGG CGAGGCGCGC GGCGCGTGGA CACGCTTCCC GGTGGCGGTG CTCGCCGTGT  1680
    TCCGGAGCGG AGAGGAGCGC GCGCGGCGCG GGAGTAGCGC GCTGTGGGTG CCCGCCGGGG  1740
    ATGCGGCCTG CGGCTGCCCG CGCCTGCTCC CCGGCCGCCG CTACCTCCTG CTGGGGGGCG  1800
    GGCCTGGAGC CGCGGCTGGG GGCGCGGGGG GCCGGGGGCC CGGGCTCATC GCCGCCCGCG  1860
    GAAGCCTCGT GCTACCCTGG AGGGACGCGT GGACGCGGCG CCTGCGGAGG CTGCAGCGAC  1920
    GCGAACGGCG GGGGCGCTGC AGCGCCGCCT GAGCCCGCCG GCTGGGCAAG GCGC        1974

(2) INFORMATION FOR SEQ ID NO:79:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 612 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: unknown
    (ii) MOLECULE TYPE: protein
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:
    Met Ile Thr Ser Val Leu Arg Tyr Val Leu Ala Leu Tyr Phe Cys Met
                      5                  10                  15
    Gly Ile Ala His Gly Ala Tyr Phe Ser Gln Phe Ser Met Arg Ala Pro
                 20                  25                  30
    Asp His Asp Pro Cys His Asp His Thr Gly Arg Pro Val Arg Cys Val
             35                  40                  45
    Pro Glu Phe Ile Asn Ala Ala Phe Gly Lys Pro Val Ile Ala Ser Asp
         50                  55                  60
    Thr Cys Gly Thr Asn Arg Pro Asp Lys Tyr Cys Thr Val Lys Glu Gly
     65                  70                  75                  80
    Pro Asp Gly Ile Ile Arg Glu Gln Cys Asp Thr Cys Asp Ala Arg Asn
                     85                  90                  95
    His Phe Gln Ser His Pro Ala Ser Leu Leu Thr Asp Leu Asn Ser Ile
                100                 105                 110
    Gly Asn Met Thr Cys Trp Val Ser Thr Pro Ser Leu Ser Pro Gln Asn
            115                 120                 125
    Val Ser Leu Thr Leu Ser Leu Gly Lys Lys Phe Glu Leu Thr Tyr Val
        130                 135                 140
    Ser Met His Phe Cys Ser Arg Leu Pro Asp Ser Met Ala Leu Tyr Lys
    145                 150                 155                 160
    Ser Ala Asp Phe Gly Lys Thr Trp Thr Pro Phe Gln Phe Tyr Ser Ser
                    165                 170                 175
    Glu Cys Arg Arg Ile Phe Gly Arg Asp Pro Asp Val Ser Ile Thr Lys
                180                 185                 190
    Ser Asn Glu Gln Glu Ala Val Cys Thr Ala Ser His Ile Met Gly Pro
            195                 200                 205
    Gly Gly Asn Arg Val Ala Phe Pro Phe Leu Glu Asn Arg Pro Ser Ala
        210                 215                 220
    Gln Asn Phe Glu Asn Ser Pro Val Leu Gln Asp Trp Val Thr Ala Thr
    225                 230                 235                 240
    Asp Ile Lys Val Val Phe Ser Arg Leu Ser Pro Asp Gln Ala Glu Leu
                    245                 250                 255
    Tyr Gly Leu Ser Asn Asp Val Asn Ser Tyr Gly Asn Glu Thr Asp Asp
                260                 265                 270
    Glu Val Lys Gln Arg Tyr Phe Tyr Ser Met Gly Glu Leu Ala Val Gly
            275                 280                 285
    Gly Arg Cys Lys Cys Asn Gly His Ala Ser Arg Cys Ile Phe Asp Lys
        290                 295                 300
    Met Gly Arg Tyr Thr Cys Asp Cys Lys His Asn Thr Ala Gly Thr Glu
    305                 310                 315                 320
    Cys Glu Met Cys Lys Pro Phe His Tyr Asp Arg Pro Trp Gly Arg Ala
                    325                 330                 335
    Thr Ala Asn Ser Ala Asn Ser Cys Val Ala Cys Asn Cys Asn Gln His
                340                 345                 350
    Ala Lys Arg Cys Arg Phe Asp Ala Glu Leu Phe Arg Leu Ser Gly Asn
            355                 360                 365
    Arg Ser Gly Gly Val Cys Leu Asn Cys Arg His Asn Thr Ala Gly Arg
        370                 375                 380
    Asn Cys His Leu Cys Lys Pro Gly Phe Val Arg Asp Thr Ser Leu Pro
    385                 390                 395                 400
    Met Thr His Arg Arg Ala Cys Lys Ser Cys Gly Cys His Pro Val Gly
                    405                 410                 415
    Ser Leu Gly Lys Ser Cys Asn Gln Ser Ser Gly Gln Cys Val Cys Lys
                420                 425                 430
    Pro Gly Val Thr Gly Thr Thr Cys Asn Arg Cys Ala Lys Gly Tyr Gln
            435                 440                 445
    Gln Ser Arg Ser Thr Val Thr Pro Cys Ile Lys Ile Pro Thr Lys Ala
        450                 455                 460
    Asp Phe Ile Gly Ser Ser His Ser Glu Glu Gln Asp Gln Cys Ser Lys
    465                 470                 475                 480
    Cys Arg Ile Val Pro Lys Arg Leu Asn Gln Lys Lys Phe Cys Lys Arg
                    485                 490                 495
    Asp His Ala Val Gln Met Val Val Val Ser Arg Glu Met Val Asp Gly
                500                 505                 510
    Trp Ala Lys Tyr Lys Ile Val Val Glu Ser Val Phe Lys Arg Thr Glu
            515                 520                 525
    Asn Met Gln Arg Arg Gly Glu Thr Ser Leu Trp Ile Ser Pro Gln Gly
        530                 535                 540
    Val Ile Cys Lys Cys Pro Lys Leu Arg Val Gly Arg Arg Tyr Leu Leu
    545                 550                 555                 560
    Leu Gly Lys Asn Asp Ser Asp His Glu Arg Asp Gly Leu Met Val Asn
                    565                 570                 575
    Pro Gln Thr Val Leu Val Glu Trp Glu Asp Asp Ile Met Asp Lys Val
                580                 585                 590
    Leu Arg Phe Ser Lys Lys Asp Lys Leu Gly Gln Cys Pro Glu Ile Thr
            595                 600                 605
    Ser His Arg Tyr
        610

(2) INFORMATION FOR SEQ ID NO:80:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 17 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "Oligonucleotide primer sense strand"
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:80:
    CTTGCAGGGCCTGCGAC    17

(2) INFORMATION FOR SEQ ID NO:81:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 17 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "Oligonucleotide primer antisense strand"
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:
    GAAGGCACAGGGTGAAC    17

(2) INFORMATION FOR SEQ ID NO:82:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 17 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "Oligonucleotide primer sense strand"
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:
    CTGCAACCAGACCACAG    17

(2) INFORMATION FOR SEQ ID NO:83:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 17 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
          /desc = "Oligonucleotide primer antisense strand"
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:
    TAGATGTGGGAGCAGCG    17
__________________________________________________________________________

What is claimed is:
 1. An isolated nucleic acid separated from its native in vivo cellular environment that encodes a human netrin polypeptide, and that hybridizes to the complement of the nucleic acid of SEQ ID NO: 19 under conditions of high stringency, 0.1×SSC/0.1% SDS and 65° C.
 2. An isolated nucleic acid of claim 1, wherein said nucleic acid is an isolated mRNA molecule.
 3. An isolated nucleic acid of claim 1, wherein said nucleic acid is an isolated DNA molecule comprising the sequence set forth in SEQ ID NO: 19.
 4. An isolated nucleic acid of claim 1, wherein said nucleic acid is an isolated DNA molecule comprising the sequence encoding the amino acid sequence set forth in SEQ ID NO: 21.
 5. An isolated nucleic acid encoding human netrin, wherein said nucleic acid is an isolated DNA molecule comprising the sequence set forth in SEQ ID NO: 78.
 6. Isolated nucleic acid according to claim 1, comprising the sequence: 5'-GCCTGTCATCGCTCTAG-3' (SEQ ID NO:59).
 7. Isolated nucleic acid according to claim 1, comprising the sequence: 5'-CAGTCGCAGGCCCTGCA-3' (SEQ ID NO:60).
 8. Isolated nucleic acid according to claim 1, comprising the sequence: 5'-GAGGACGCGCCAACATC-3' (SEQ ID NO:61).
 9. Isolated nucleic acid according to claim 1, comprising the sequence: 5'-CGGCAGTAGTGGCAGTG-3' (SEQ ID NO:62).
 10. Isolated nucleic acid according to claim 1, comprising the sequence: 5'-CCTGCCTCGCTTGCTCCTGC-3' (SEQ ID NO:63).
 11. Isolated nucleic acid according to claim 1, comprising the sequence: 5'-CGGGCAGCCGCAGGCCGCAT-3' (SEQ ID NO:64).
 12. Isolated nucleic acid according to claim 1, comprising the sequence: 5'-CCTGCAACGGCCATGCCCGC-3' (SEQ ID NO:65).
 13. Isolated nucleic acid according to claim 1, comprising the sequence: 5'-GCATCCCCGGCGGGCACCCA-3' (SEQ ID NO:66).
 14. Isolated nucleic acid according to claim 1, comprising the sequence: 5'-CTTGCAGGGCCTGCGAC-3' (SEQ ID NO:80).
 15. Isolated nucleic acid according to claim 1, comprising the sequence 5'-GAAGGCACAGGGTGAAC-3' (SEQ ID NO:81).
 16. Isolated nucleic acid according to claim 1, comprising the sequence 5'-CTGCAACCAGACCACAG-3' (SEQ ID NO:82).
 17. Isolated nucleic acid according to claim 1, comprising the sequence 5'-TAGATGTGGGAGCAGCG-3' (SEQ ID NO:83).
 18. A vector comprising the isolated nucleic acid of claim 1.
 19. An isolated host cell comprising the vector of claim 18.
 20. A method for producing human netrin protein, said method comprising: (a) culturing the host cell of claim 19 in a medium and under conditions suitable for expression of said protein, and (b) isolating said expressed protein from the host cell.
 21. A method for identifying compounds which bind to human netrin (hNET) polypeptide, said method comprising a competitive binding assay further comprising: a) culturing the cells according to claim 19 under conditions so that the human netrin polypeptide is produced by the cells; and b) exposing the cells to a plurality of compounds and identifying compounds which bind human netrin polypeptide.
 22. An isolated host cell of claim 19, wherein the cell is a procaryotic cell.
 23. The method of claim 20, wherein the host cell is a procaryotic cell.
 24. The method of claim 21, wherein the host cell is a procaryotic cell.
 25. A composition comprising the isolated nucleic acid of any of claims 2-17 and a carrier.